EleutherAI/llemma_7b

Parameters: 7B
Quantization: FP8
Context length: 4096
License: llama2
Overview

Llemma 7B: A Language Model for Mathematics

Llemma 7B, developed by EleutherAI, is a 7 billion parameter language model specialized in mathematical reasoning and problem-solving. It was initialized from Code Llama 7B weights and then trained on the Proof-Pile-2 dataset for 200 billion tokens to strengthen its mathematical capabilities.

Key Capabilities

  • Chain-of-Thought Mathematical Reasoning: Llemma 7B outperforms similarly sized Llama-2 and Code Llama models on chain-of-thought mathematical reasoning benchmarks, including GSM8k, OCW, MMLU-STEM, SAT, and MATH.
  • Computational Tool Use: The model can use computational tools, such as Python and formal theorem provers, to solve mathematical problems.
  • Competitive Performance: When controlling for model size, Llemma 7B surpasses Minerva models on several mathematical benchmarks, particularly in chain-of-thought reasoning.
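
As an illustration of the Python tool-use capability listed above, the following is a minimal sketch of one common pattern: the model writes a short program and a harness runs it to obtain the answer. It is not EleutherAI's evaluation code. The helper names (extract_python_block, run_program, solve_with_tool) are hypothetical, the completion is a hand-written stand-in for model output, and the sketch assumes the model emits its program inside a fenced Python block. Generated code is untrusted, so a real setup would want stronger isolation than a subprocess with a timeout.

```python
import re
import subprocess
import sys
from typing import Optional

# Built at runtime so this example can itself be shown inside a fenced block.
FENCE = "`" * 3


def extract_python_block(completion: str) -> Optional[str]:
    """Return the body of the first fenced Python block, or None if absent."""
    pattern = re.escape(FENCE) + r"python\s*\n(.*?)" + re.escape(FENCE)
    match = re.search(pattern, completion, flags=re.DOTALL)
    return match.group(1) if match else None


def run_program(code: str, timeout_s: float = 5.0) -> str:
    """Run code in a separate interpreter and return its stripped stdout."""
    result = subprocess.run(
        [sys.executable, "-c", code],
        capture_output=True,
        text=True,
        timeout=timeout_s,
    )
    if result.returncode != 0:
        raise RuntimeError(result.stderr.strip())
    return result.stdout.strip()


def solve_with_tool(completion: str) -> Optional[str]:
    """Extract the program from a completion and run it; None if no program."""
    code = extract_python_block(completion)
    return run_program(code) if code is not None else None


if __name__ == "__main__":
    # Hand-written stand-in for a model completion (assumption, not real output).
    completion = (
        "We sum the integers from 1 to 10.\n"
        + FENCE + "python\nprint(sum(range(1, 11)))\n" + FENCE
    )
    print(solve_with_tool(completion))
```

Running the program in a separate interpreter keeps a crash or infinite loop in generated code from taking down the caller, at the cost of a process launch per problem.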

Use Cases

  • Mathematical Problem Solving: Ideal for tasks requiring step-by-step mathematical reasoning.
  • Assisting with Formal Proofs: Can be used in conjunction with theorem provers.
  • Educational Tools: Potentially useful for generating explanations or solutions for mathematical problems.
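
For the step-by-step problem-solving use case, a few-shot prompt is a reasonable starting point, since Llemma 7B is a base model rather than an instruction-tuned one. The sketch below builds such a prompt and pulls the final answer out of a generated solution. The "Problem:" / "Solution:" layout and both function names are assumptions chosen for illustration, not a format the model requires; the \boxed{...} convention for final answers follows the style of MATH benchmark solutions.

```python
from typing import List, Optional, Tuple


def build_few_shot_prompt(examples: List[Tuple[str, str]], question: str) -> str:
    """Join worked (problem, solution) pairs and end with the unsolved question."""
    parts = [f"Problem:\n{q}\n\nSolution:\n{s}" for q, s in examples]
    parts.append(f"Problem:\n{question}\n\nSolution:\n")
    return "\n\n".join(parts)


def last_boxed_answer(text: str) -> Optional[str]:
    """Return the contents of the last \\boxed{...}, respecting nested braces."""
    marker = "\\boxed{"
    start = text.rfind(marker)
    if start == -1:
        return None
    begin = start + len(marker)
    depth = 1
    for pos in range(begin, len(text)):
        if text[pos] == "{":
            depth += 1
        elif text[pos] == "}":
            depth -= 1
            if depth == 0:
                return text[begin:pos]
    return None  # braces never balanced, so no usable answer


if __name__ == "__main__":
    examples = [("What is 1 + 1?", r"Adding gives $1 + 1 = \boxed{2}$.")]
    prompt = build_few_shot_prompt(examples, "What is half of 1?")
    print(prompt)
    # Hand-written stand-in for a generated solution (assumption).
    print(last_boxed_answer(r"Half of 1 is $\boxed{\frac{1}{2}}$."))
```

Scanning with a brace counter instead of a regular expression lets the extractor handle answers such as fractions, where the boxed content itself contains braces.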