Light-R1-32B-DS: A Specialized Math Model
Light-R1-32B-DS is a 32-billion-parameter model from Qihoo360 designed for advanced mathematical reasoning. It is fine-tuned from DeepSeek-R1-Distill-Qwen-32B and achieves near-state-of-the-art results on math benchmarks despite a remarkably small training dataset.
Key Capabilities & Features
- Exceptional Math Performance: Achieves scores of 78.1 on AIME24 and 65.9 on AIME25, positioning it as a strong contender in mathematical problem-solving among 32B models.
- Efficient Training: Developed using only 3,000 supervised fine-tuning (SFT) examples, highlighting its data efficiency and the quality of the training methodology.
- Robust Data Decontamination: The training data was decontaminated against benchmark test sets using exact matching and N-gram matching, so that reported scores are not inflated by leaked test problems.
- Technical Report Available: Detailed insights into its development and performance are provided in its technical report.
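
The decontamination step mentioned above can be illustrated with a minimal sketch. This is not the pipeline used for Light-R1-32B-DS; the function names, the normalization rule, and the N-gram length `n=8` are all assumptions chosen for illustration. It drops a training sample if, after normalization, it exactly matches a benchmark problem or shares at least one word-level N-gram with it.

```python
import re
from typing import List, Set, Tuple


def normalize(text: str) -> str:
    """Lowercase and collapse every run of non-alphanumeric characters into one space."""
    return re.sub(r"[^a-z0-9]+", " ", text.lower()).strip()


def ngrams(text: str, n: int) -> Set[Tuple[str, ...]]:
    """Return the set of word-level n-grams of the normalized text."""
    tokens = normalize(text).split()
    return {tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)}


def is_contaminated(sample: str, benchmark: List[str], n: int = 8) -> bool:
    """Flag a sample that exactly matches, or shares an n-gram with, any benchmark problem.

    n=8 is an illustrative assumption, not a value taken from the model's report.
    """
    sample_norm = normalize(sample)
    sample_grams = ngrams(sample, n)
    for problem in benchmark:
        if sample_norm == normalize(problem):  # exact match after normalization
            return True
        if sample_grams & ngrams(problem, n):  # at least one shared n-gram
            return True
    return False


def decontaminate(train: List[str], benchmark: List[str], n: int = 8) -> List[str]:
    """Keep only the training samples that are not flagged as contaminated."""
    return [s for s in train if not is_contaminated(s, benchmark, n)]
```

The exact-match check catches short problems that have fewer than `n` tokens and therefore produce no N-grams, while the N-gram check catches paraphrased or extended copies of a test problem. A smaller `n` removes more data at the cost of more false positives.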
When to Use This Model
- Mathematical Reasoning Tasks: Ideal for applications requiring high accuracy in complex mathematical problem-solving, competitive math, and scientific calculations.
- Resource-Efficient Fine-Tuning: Shows that a small, high-quality SFT dataset can deliver high performance, making it a useful reference for data-efficient model development.
- Benchmarking and Research: Useful for researchers and developers interested in advanced math capabilities and data-efficient training techniques.
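
For benchmarking, a small scoring helper is often all that is needed on top of the model's outputs. The sketch below is not an official evaluation script: it assumes the model writes its final answer as `\boxed{...}` (a common convention, not something stated here) and relies on the fact that AIME answers are integers from 0 to 999. The names `extract_answer` and `accuracy` are hypothetical.

```python
import re
from typing import List, Optional

# Assumption: the final answer appears as \boxed{<integer with 1 to 3 digits>}.
BOXED = re.compile(r"\\boxed\{\s*(\d{1,3})\s*\}")


def extract_answer(completion: str) -> Optional[int]:
    """Return the last boxed integer in the completion, or None if there is none."""
    matches = BOXED.findall(completion)
    return int(matches[-1]) if matches else None


def accuracy(completions: List[str], references: List[int]) -> float:
    """Percentage of completions whose extracted answer equals the reference answer."""
    if len(completions) != len(references):
        raise ValueError("completions and references must have the same length")
    if not references:
        raise ValueError("at least one problem is required")
    correct = sum(extract_answer(c) == r for c, r in zip(completions, references))
    return 100.0 * correct / len(references)
```

A completion with no boxed answer counts as incorrect. Because sampled generations vary from run to run, averaging `accuracy` over several independent runs gives a more stable number than a single pass.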