Overview
nvidia/AceMath-RL-Nemotron-7B is a 7.6-billion-parameter language model developed by NVIDIA for advanced mathematical reasoning. It is trained with reinforcement learning (RL), starting from the DeepSeek-R1-Distill-Qwen-7B base model. This RL-centric training substantially improves its accuracy on complex math problems.
Key Capabilities and Performance
- Exceptional Math Reasoning: Achieves 69.0% Pass@1 accuracy on AIME 2024 and 53.6% Pass@1 accuracy on AIME 2025, a substantial gain over its base model and ahead of competing models.
- Generalization to Coding: The math-focused RL training surprisingly improves its coding accuracy, reaching 44.4% Pass@1 on LiveCodeBench, showcasing the broad applicability of its learned reasoning skills.
- Competitive Benchmarking: Outperforms several comparable models on AIME benchmarks and shows strong results on other math datasets such as GSM8K, MATH500, and OlympiadBench.
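The Pass@1 figures above can be read as the chance that a single sampled answer is correct, averaged over all problems in a benchmark. The sketch below is one common way to compute that number. It assumes each problem has one or more sampled answers already graded as correct or incorrect. The function name and input layout are hypothetical, and the source does not state how many samples per problem were used.

```python
def pass_at_1(results):
    """Average single-sample accuracy, as a percentage.

    `results` is a list with one entry per problem; each entry is a
    non-empty list of booleans, one per sampled answer (True = correct).
    This layout is an assumption made for illustration.
    """
    if not results:
        raise ValueError("no problems given")
    per_problem = []
    for samples in results:
        if not samples:
            raise ValueError("every problem needs at least one sample")
        # Fraction of correct samples estimates the chance one sample is right.
        per_problem.append(sum(1 for ok in samples if ok) / len(samples))
    return 100.0 * sum(per_problem) / len(per_problem)


if __name__ == "__main__":
    graded = [[True, False], [True, True]]
    print(f"Pass@1 = {pass_at_1(graded):.1f}%")
```

With a single sample per problem this reduces to plain accuracy; averaging over several samples mainly reduces variance on small benchmarks such as AIME.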
Usage Recommendations
- Prompting: Put all instructions directly in the user prompt; do not use a separate system prompt.
- Math Question Format: For optimal performance on math questions, use the format:
<｜begin▁of▁sentence｜><｜User｜>{math_question}\nPlease reason step by step, and put your final answer within \boxed{}.<｜Assistant｜><think>\n
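As a minimal sketch of the format above, the snippet below builds only the user-turn text (the question followed by the step-by-step instruction) and reads the final answer back out of a response. It assumes the special tokens around the user turn are added separately, for example by the tokenizer's chat template. Both helper names are hypothetical and not part of any NVIDIA API.

```python
# Instruction text taken from the recommended math question format.
INSTRUCTION = "Please reason step by step, and put your final answer within \\boxed{}."


def build_user_message(math_question: str) -> str:
    # Question first, then the instruction on a new line; no system prompt.
    return f"{math_question}\n{INSTRUCTION}"


def extract_boxed(text: str):
    # Return the contents of the last \boxed{...} in `text`, or None.
    # Braces are counted so nested LaTeX such as \frac{1}{2} survives.
    marker = "\\boxed{"
    start = text.rfind(marker)
    if start == -1:
        return None
    depth = 1
    collected = []
    for ch in text[start + len(marker):]:
        if ch == "{":
            depth += 1
        elif ch == "}":
            depth -= 1
            if depth == 0:
                return "".join(collected)
        collected.append(ch)
    return None  # unbalanced braces: no complete answer found


if __name__ == "__main__":
    print(build_user_message("What is 1 + 1?"))
    print(extract_boxed("So the result is \\boxed{\\frac{1}{2}}."))
```

Taking the last `\boxed{}` rather than the first is a design choice: reasoning traces may box intermediate values before the final answer.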
This model is ideal for applications requiring robust mathematical problem-solving and can also be leveraged for coding tasks where logical reasoning is paramount. Further details on the training recipe and data curation are available in the NVIDIA blog post.