sail/Qwen2.5-Math-7B-Oat-Zero

Hugging Face
TEXT GENERATIONConcurrency Cost:1Model Size:7.6BQuant:FP8Ctx Length:32kPublished:Mar 17, 2025License:apache-2.0Architecture:Transformer0.0K Open Weights Warm

The sail/Qwen2.5-Math-7B-Oat-Zero model is a 7.6 billion parameter language model developed by sail, based on the Qwen2.5-Math-7B architecture. It is specifically fine-tuned using the minimalist R1-Zero recipe and Dr. DRPO algorithm on level 3-5 questions from the MATH dataset. This model is optimized for advanced mathematical reasoning and problem-solving tasks, demonstrating strong performance on widely used math benchmarks.

Loading preview...

Model Overview

The sail/Qwen2.5-Math-7B-Oat-Zero is a 7.6 billion parameter language model derived from the Qwen2.5-Math-7B base model. Developed by sail, this model is distinguished by its training methodology, which utilizes the minimalist R1-Zero recipe and the Dr. DRPO algorithm. Its training data specifically focuses on mathematical problems, incorporating level 3-5 questions from the MATH dataset.

Key Capabilities

  • Advanced Mathematical Reasoning: Optimized for complex mathematical problem-solving, as evidenced by its training on challenging MATH dataset questions.
  • Specialized Fine-tuning: Employs the R1-Zero recipe and Dr. DRPO algorithm for targeted performance enhancement in mathematical domains.
  • Benchmark Performance: Demonstrates strong results on widely recognized math benchmarks, indicating its proficiency in quantitative tasks.

Good For

  • Mathematical Problem Solving: Ideal for applications requiring precise and step-by-step mathematical reasoning.
  • Research in LLM Training: Useful for researchers exploring the effectiveness of minimalist training recipes like R1-Zero for domain-specific tasks.
  • Educational Tools: Can be integrated into systems designed to assist with or evaluate solutions to advanced math problems.