EleutherAI/llemma_34b

Text Generation | Concurrency Cost: 2 | Model Size: 34B | Quant: FP8 | Ctx Length: 32k | Published: Sep 27, 2023 | License: llama2 | Architecture: Transformer | Open Weights

EleutherAI's Llemma 34B is a 34 billion parameter language model for mathematics, initialized from Code Llama 34B and further trained on the Proof-Pile-2 dataset for 50 billion tokens. This model specializes in mathematical reasoning, outperforming Llama-2, Code Llama, and similarly sized Minerva models on chain-of-thought math tasks. It demonstrates strong capabilities in using computational tools for mathematics, including Python and formal theorem provers, making it suitable for advanced mathematical problem-solving.


Llemma 34B: A Specialized Language Model for Mathematics

Llemma 34B, developed by EleutherAI, is a 34-billion-parameter language model built for mathematical tasks. It was initialized from the weights of Code Llama 34B and then trained further on the Proof-Pile-2 dataset for 50 billion tokens. This domain-specific training targets mathematical reasoning and problem-solving.

Key Capabilities & Performance

Llemma models are particularly strong in two areas:

  • Chain-of-Thought Mathematical Reasoning: Llemma 34B outperforms Llama-2 and Code Llama on chain-of-thought math benchmarks. For instance, it scores 51.5% on GSM8k and 25.0% on MATH, compared with 29.6% and 12.2% respectively for Code Llama 34B. Against the larger Minerva 62B, Llemma 34B is competitive, particularly with majority voting (e.g., 69.3% on GSM8k with maj@100, where the reported answer is the most common one among 100 sampled solutions).
  • Computational Tool Use: The model can use computational tools for mathematical tasks, including Python and formal theorem provers. Details of the tool-use and formal theorem proving evaluations are available in the associated research paper.
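The majority voting idea behind a maj@k score can be illustrated with a minimal sketch. This is not the evaluation code used for the reported numbers. The `extract_final_answer` helper and its "Final answer:" marker are assumptions made for illustration, and real evaluations typically normalize answers more carefully before counting.

```python
from collections import Counter
from typing import Iterable, Optional


def extract_final_answer(completion: str) -> Optional[str]:
    """Pull the final answer out of one sampled solution.

    Assumption (hypothetical convention): the solution ends with a line
    of the form 'Final answer: <value>'.
    """
    marker = "Final answer:"
    idx = completion.rfind(marker)
    if idx == -1:
        return None  # this sample produced no parseable answer
    lines = completion[idx + len(marker):].strip().splitlines()
    if not lines:
        return None
    # Drop surrounding whitespace and a trailing sentence period.
    return lines[0].strip().rstrip(".")


def majority_vote(completions: Iterable[str]) -> Optional[str]:
    """Return the most frequent final answer across sampled solutions."""
    answers = [
        answer
        for answer in map(extract_final_answer, completions)
        if answer is not None
    ]
    if not answers:
        return None
    # Ties are broken in favor of the answer seen first.
    return Counter(answers).most_common(1)[0][0]


samples = [
    "16 - 3 - 4 = 9 eggs, 9 * 2 = 18.\nFinal answer: 18",
    "16 - 3 = 13, 13 + 4 = 17.\nFinal answer: 17",
    "She sells 9 eggs at $2 each.\nFinal answer: 18.",
    "I am not sure how to solve this.",
]
print(majority_vote(samples))  # prints 18
```

With k = 100, the same function would be applied to 100 sampled solutions per problem, and the voted answer compared against the reference.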

When to Use Llemma 34B

Llemma 34B is particularly well-suited for applications requiring:

  • Advanced mathematical problem-solving.
  • Automated theorem proving and formal verification.
  • Generating and verifying mathematical proofs.
  • Educational tools for mathematics.
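For the use cases above, one possible way to query the open weights is through the Hugging Face `transformers` library. The sketch below is a hedged example, not an official recipe: the prompt format in `build_prompt` is a hypothetical choice, and the code assumes `torch`, `transformers`, and `accelerate` are installed and that enough GPU memory is available for a 34B model.

```python
def build_prompt(problem: str) -> str:
    """Format a problem as a chain-of-thought prompt.

    Assumption: this template is illustrative only. Llemma is a base
    model, so few-shot prompts with worked examples may work better.
    """
    return f"Problem: {problem}\nSolution: Let's think step by step."


def generate_solution(
    problem: str,
    model_id: str = "EleutherAI/llemma_34b",
    max_new_tokens: int = 256,
) -> str:
    """Generate one greedy solution for a math problem."""
    # Imported here so build_prompt stays usable without these packages.
    import torch
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(model_id)
    model = AutoModelForCausalLM.from_pretrained(
        model_id,
        torch_dtype=torch.bfloat16,  # assumption: bf16-capable hardware
        device_map="auto",           # requires the accelerate package
    )

    inputs = tokenizer(build_prompt(problem), return_tensors="pt").to(model.device)
    output_ids = model.generate(
        **inputs,
        max_new_tokens=max_new_tokens,
        do_sample=False,  # greedy decoding
    )
    # Keep only the newly generated tokens, not the echoed prompt.
    new_tokens = output_ids[0][inputs["input_ids"].shape[1]:]
    return tokenizer.decode(new_tokens, skip_special_tokens=True)
```

Calling `generate_solution("What is the derivative of x^2?")` downloads the weights on first use, so it is best tried on a machine provisioned for a model of this size.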