CausalLM/34b-beta

TEXT GENERATIONConcurrency Cost:2Model Size:34BQuant:FP8Ctx Length:32kPublished:Feb 6, 2024License:gpl-3.0Architecture:Transformer0.1K Open Weights Cold

CausalLM/34b-beta is a 34 billion parameter causal language model developed by CausalLM, featuring a 32768 token context length. This model is instruction-tuned using the ChatML format and achieves an MT-Bench score of 8.5. It is noted for its low MMLU contamination score of 0.38, indicating a focus on original data, and is recommended for general language generation tasks.

Loading preview...

CausalLM/34b-beta: An Instruction-Tuned 34B Parameter Model

CausalLM/34b-beta is a 34 billion parameter causal language model designed for general language generation and understanding tasks. It utilizes the ChatML prompt format for instruction-following, making it suitable for conversational AI and task-oriented applications.

Key Capabilities & Features

  • Strong Instruction Following: Achieves an MT-Bench score of 8.5, indicating robust performance in multi-turn conversations and complex instructions.
  • Low Data Contamination: Demonstrates a low MMLU contamination score of 0.38, suggesting a training dataset with minimal overlap with common benchmarks, which can lead to more generalized knowledge.
  • Optimized for Transformers Inference: While the model has known precision issues that require specific inference methods, it performs best with the Transformers library or llama.cpp with q8_0 quantization for optimal output quality.
  • 32K Context Window: Supports a substantial context length of 32,768 tokens, enabling the processing of longer inputs and generating more coherent, extended responses.

Important Considerations

  • Inference Frameworks: Users are advised to avoid "accelerated inference frameworks" like VLLM temporarily due to precision issues that can degrade output quality. Transformers or llama.cpp (q8_0) are recommended.
  • Repetition Penalty: The model is designed to perform optimally without a repetition penalty.
  • Training Data: While contamination is minimized, the README notes that some level of training data contamination is unavoidable due to cost constraints.

Good for

  • Applications requiring strong instruction-following capabilities.
  • General text generation, summarization, and question-answering.
  • Developers seeking a 34B model with a focus on original data and robust conversational performance.