nvidia/OpenMath-Nemotron-14B

  • Parameters: 14.8B
  • Precision: FP8
  • Context length: 131072
  • License: cc-by-4.0
  • Available on Hugging Face

Overview

OpenMath-Nemotron-14B is a 14.8 billion parameter model developed by NVIDIA, built upon the Qwen2.5 architecture and fine-tuned using the OpenMathReasoning dataset. It is specifically designed for advanced mathematical reasoning tasks and has demonstrated strong performance on challenging benchmarks.

Key Capabilities

  • State-of-the-Art Mathematical Reasoning: Achieves high scores on benchmarks such as AIME24, AIME25, HMMT-24-25, and HLE-Math, outperforming other models in its class.
  • AIMO-2 Competition Winner: A version of this model secured first place in the AIMO-2 Kaggle competition, highlighting its effectiveness in complex mathematical problem-solving.
  • Multiple Inference Modes: Supports Chain-of-Thought (CoT), Tool-Integrated Reasoning (TIR), and Generative Solution Selection (GenSelect) for flexible problem-solving approaches.
  • Extended Context Length: Capable of processing inputs and generating outputs with a context length of up to 131,072 tokens.
  • Open-Sourced Pipeline: The data and model pipeline, including code, models, and dataset, are fully open-sourced for reproducibility and further research.
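As a minimal sketch of the Chain-of-Thought (CoT) mode listed above: the snippet below builds a chat message that wraps a math problem in the CoT-style instruction suggested on the model's Hugging Face card (asking for the final answer inside `\boxed{}`). The exact serving stack (vLLM, Transformers, etc.) is left to the reader; the helper name `build_cot_messages` is illustrative, not part of any official API.

```python
def build_cot_messages(problem: str) -> list[dict]:
    """Wrap a math problem in a CoT-style instruction for the model.

    The instruction wording follows the prompt format suggested on the
    model card; adjust it if your deployment recommends otherwise.
    """
    instruction = (
        "Solve the following math problem. "
        "Make sure to put the answer (and only answer) inside \\boxed{}.\n\n"
        + problem
    )
    return [{"role": "user", "content": instruction}]


# Example: build the messages for a simple problem.
messages = build_cot_messages("What is 7 * 6?")
```

The resulting `messages` list can then be passed to a chat-completion endpoint or rendered with a tokenizer's `apply_chat_template` before generation.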

Good for

  • Mathematical Problem Solving: Ideal for applications requiring high-accuracy solutions to complex math problems.
  • Research in Mathematical AI: Provides a strong baseline and open-source tools for researchers exploring mathematical reasoning in LLMs.
  • Mathematical Competitions: Demonstrated success in contests such as the AIMO-2 Kaggle competition.

Limitations

  • Domain Specificity: This model has not been instruction-tuned on general data and may not perform well outside the mathematical domain.