nvidia/AceReason-Nemotron-14B

Visibility: Public
Parameters: 14B
Quantization: FP8
Context length: 32768
License: nvidia-open-model-license
Overview

AceReason-Nemotron-14B: RL-Enhanced Math and Code Reasoning

AceReason-Nemotron-14B is a 14-billion-parameter model from NVIDIA, distinguished by its training methodology: it is developed entirely through reinforcement learning (RL), starting from the DeepSeek-R1-Distilled-Qwen-14B base model. This approach significantly boosts its capabilities on complex math and code reasoning tasks.

Key Capabilities & Performance

  • Reinforcement Learning Focus: The model's core differentiator is its RL-only training, which has been shown to elicit and push the limits of foundational reasoning abilities.
  • Exceptional Math Reasoning: Achieves 78.6% on AIME 2024 (+8.9% over the starting model) and 67.4% on AIME 2025 (+17.4%), demonstrating strong performance in advanced mathematical problem-solving.
  • Robust Code Generation: Scores 61.1% on LiveCodeBench v5 (+8%) and 54.9% on LiveCodeBench v6 (+7%), and reaches a Codeforces rating of 2024 (+543 points), indicating proficiency in generating and understanding code.
  • Strategic RL Training: NVIDIA's research shows that initial RL training on math-only prompts enhances both math and code reasoning, with subsequent code-only RL further improving code performance while maintaining math scores.
  • High Context Length: Supports a context length of 32768 tokens, beneficial for handling intricate problems requiring extensive context.

Usage Recommendations

  • No System Prompt: Instructions should be integrated directly into the user prompt.
  • Math Instruction: For math questions, use "Please reason step by step, and put your final answer within \boxed{}."
  • Code Instruction: For code questions, follow the model card's formatting guidance, which differs depending on whether the problem supplies starter code or only a general code instruction.
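The recommendations above can be sketched in code. This is a minimal, illustrative example: the helper name `build_math_prompt` is ours, and the commented-out generation call assumes the standard Hugging Face `transformers` API; consult the model card for the authoritative usage.

```python
# Sketch: folding the recommended instructions into the user prompt,
# since the model is meant to be used WITHOUT a system prompt.

# Math instruction recommended for this model (from the usage notes above).
MATH_INSTRUCTION = (
    "Please reason step by step, and put your final answer within \\boxed{}."
)

def build_math_prompt(question: str) -> str:
    """Append the math instruction to the question itself, forming a
    single user turn (no system message)."""
    return f"{question}\n\n{MATH_INSTRUCTION}"

prompt = build_math_prompt(
    "What is the sum of the first 100 positive integers?"
)

# A typical transformers-based generation call might then look like this
# (requires the `transformers` library and sufficient GPU memory,
# so it is left commented out here):
#
# from transformers import AutoTokenizer, AutoModelForCausalLM
# tok = AutoTokenizer.from_pretrained("nvidia/AceReason-Nemotron-14B")
# model = AutoModelForCausalLM.from_pretrained("nvidia/AceReason-Nemotron-14B")
# messages = [{"role": "user", "content": prompt}]  # user turn only
# inputs = tok.apply_chat_template(
#     messages, add_generation_prompt=True, return_tensors="pt"
# )
# output = model.generate(inputs, max_new_tokens=4096)
# print(tok.decode(output[0], skip_special_tokens=True))
```

The key point is that all task instructions live inside the user message, matching the "No System Prompt" recommendation.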