nvidia/AceMath-RL-Nemotron-7B

Status: Warm
Visibility: Public
Parameters: 7.6B
Precision: FP8
Context length: 131,072 tokens
License: nvidia-open-model-license
Weights: Hugging Face
Overview

nvidia/AceMath-RL-Nemotron-7B is a 7.6-billion-parameter language model developed by NVIDIA and engineered for advanced mathematical reasoning. It was trained exclusively with reinforcement learning (RL), starting from the DeepSeek-R1-Distill-Qwen-7B base model, and this RL-centric training substantially improves its ability to solve complex math problems.

Key Capabilities and Performance

  • Exceptional Math Reasoning: Achieves 69.0% Pass@1 accuracy on AIME 2024 and 53.6% Pass@1 accuracy on AIME 2025, substantial gains over its base model and competitive alternatives (see the Pass@1 sketch after this list).
  • Generalization to Coding: Although the RL training targets math, it also improves coding accuracy, reaching 44.4% Pass@1 on LiveCodeBench and showing that the learned reasoning skills transfer beyond mathematics.
  • Competitive Benchmarking: Outperforms several comparable models on the AIME benchmarks and posts strong results on other math datasets such as GSM8K, MATH500, and OlympiadBench.
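
For reference, Pass@1 numbers of this kind follow the standard unbiased pass@k estimator of Chen et al. (2021); the sketch below makes the metric concrete. The sample counts in the example are hypothetical, since the card does not state how many solutions were drawn per problem.

```python
# Hedged sketch: the standard unbiased pass@k estimator (Chen et al., 2021).
# For k = 1 this reduces to the fraction of correct samples, c / n.
from math import comb

def pass_at_k(n: int, c: int, k: int) -> float:
    """n = samples drawn per problem, c = correct samples among them."""
    if n - c < k:
        return 1.0  # every size-k subset contains at least one correct sample
    return 1.0 - comb(n - c, k) / comb(n, k)

# Hypothetical example: 44 of 64 sampled solutions correct.
print(pass_at_k(n=64, c=44, k=1))  # 0.6875, i.e. 44/64
```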

Usage Recommendations

  • Prompting: Place all instructions directly in the user prompt; do not use a separate system prompt.
  • Math Question Format: For optimal performance on math questions, use the following template (a runnable sketch follows this list): <|begin▁of▁sentence|><|User|>{math_question}\nPlease reason step by step, and put your final answer within \boxed{}.<|Assistant|><think>\n
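
Below is a minimal sketch of this template in use with the Hugging Face transformers library. The example question, token budget, and sampling settings (temperature, top_p) are illustrative assumptions, not documented recommendations for this model.

```python
# Minimal sketch: querying AceMath-RL-Nemotron-7B with the recommended
# math prompt. The question, token budget, and sampling values below are
# illustrative assumptions, not official settings.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "nvidia/AceMath-RL-Nemotron-7B"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id, torch_dtype=torch.bfloat16, device_map="auto"
)

math_question = "What is the remainder when 7^100 is divided by 13?"

# All instructions live in the user turn; no separate system prompt.
# The template already includes the BOS token, so we skip adding it again.
prompt = (
    "<|begin▁of▁sentence|><|User|>"
    f"{math_question}\n"
    "Please reason step by step, and put your final answer within \\boxed{}."
    "<|Assistant|><think>\n"
)

inputs = tokenizer(prompt, return_tensors="pt",
                   add_special_tokens=False).to(model.device)
outputs = model.generate(
    **inputs,
    max_new_tokens=8192,  # leave room for the long reasoning trace
    temperature=0.6,      # illustrative sampling values
    top_p=0.95,
    do_sample=True,
)
print(tokenizer.decode(outputs[0][inputs["input_ids"].shape[1]:],
                       skip_special_tokens=True))
```

Because the template already opens the <think> tag, the model emits its chain of thought first and then the final boxed answer, which is why a generous max_new_tokens budget is used here.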

This model is ideal for applications requiring robust mathematical problem-solving and can also be leveraged for coding tasks where logical reasoning is paramount. Further details on the training recipe and data curation are available in the NVIDIA blog post.