sail/Llama-3.2-3B-Oat-Zero

Text generation · Model size: 3.2B · Quantization: BF16 · Context length: 32k · Published: Mar 17, 2025 · License: apache-2.0 · Architecture: Transformer · Open weights

sail/Llama-3.2-3B-Oat-Zero is a 3B-parameter language model from the Llama 3.2 family, developed by sail and fine-tuned with the minimalist R1-Zero recipe and the Dr. GRPO algorithm. Built on lkevinzc/Llama-3.2-3B-NuminaQA, it is optimized for mathematical reasoning and trained on level 3-5 questions from the MATH dataset. The model supports a 32,768-token context length and performs strongly on widely used math benchmarks.


sail/Llama-3.2-3B-Oat-Zero: Specialized for Mathematical Reasoning

sail/Llama-3.2-3B-Oat-Zero is a 3B-parameter model developed by sail, focused on mathematical problem-solving. It is built on the lkevinzc/Llama-3.2-3B-NuminaQA base model and fine-tuned with the minimalist R1-Zero recipe and the Dr. GRPO algorithm, as detailed in the accompanying research paper.

Key Capabilities

  • Mathematical Reasoning: Trained specifically on challenging level 3-5 questions from the MATH dataset.
  • R1-Zero Recipe: Follows a minimalist training pipeline in which reinforcement learning is applied directly to the base model, without a separate supervised fine-tuning stage.
  • Dr. GRPO Algorithm: Fine-tuned with Dr. GRPO, a variant of GRPO that corrects optimization biases, introduced in the accompanying paper.
  • Context Length: Supports a context window of 32,768 tokens.

Performance

Evaluation results indicate strong performance on various math benchmarks, positioning it as a capable model for complex quantitative tasks. The model employs a specific R1 template for prompting, designed to facilitate step-by-step reasoning and structured answers.
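The exact R1 template is defined in the model's repository; the sketch below only illustrates the general shape of such a template (the wording and the `build_prompt` helper are assumptions for demonstration, not the verbatim template):

```python
# Illustrative R1-style prompt template. The exact wording used by
# sail/Llama-3.2-3B-Oat-Zero lives in its repository; this version is
# an assumption for demonstration purposes only.
R1_TEMPLATE = (
    "A conversation between User and Assistant. The User asks a question, "
    "and the Assistant solves it. The Assistant first thinks about the "
    "reasoning process and then provides the final answer.\n"
    "User: {question}\n"
    "Assistant: <think>"
)


def build_prompt(question: str) -> str:
    """Fill a math question into the R1-style template."""
    return R1_TEMPLATE.format(question=question)


prompt = build_prompt("What is 7 * 8?")
print(prompt)
```

Ending the prompt inside the `<think>` tag nudges the model to emit its reasoning first, followed by a structured answer that can be parsed downstream.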

Use Cases

This model is particularly well-suited for applications requiring precise mathematical problem-solving and reasoning, especially in educational tools, research, or any domain where accurate numerical and logical deduction is critical.
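In such applications the final answer usually needs to be pulled out of the model's structured output. A minimal sketch, assuming the completion wraps its answer in an `<answer>` tag or a LaTeX `\boxed{}` expression (both common conventions for R1-style math models, not confirmed by this card):

```python
import re
from typing import Optional


def extract_answer(completion: str) -> Optional[str]:
    """Return the final answer from an R1-style completion.

    Prefers an <answer>...</answer> tag, then falls back to the last
    \\boxed{...} expression; both conventions are assumptions here.
    """
    m = re.search(r"<answer>(.*?)</answer>", completion, re.DOTALL)
    if m:
        return m.group(1).strip()
    boxed = re.findall(r"\\boxed\{([^{}]*)\}", completion)
    return boxed[-1].strip() if boxed else None


print(extract_answer("<think>7*8=56</think><answer>56</answer>"))  # 56
```

Keeping extraction separate from generation makes it easy to grade outputs against reference answers in an evaluation harness.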