nvidia/NFT-32B

Parameters: 32.8B
Precision: FP8
Context length: 131,072 tokens
Released: Jun 17, 2025
License: nvidia-non-commercial-license
Available on: Hugging Face

Model Overview

NFT-32B is a 32.5-billion-parameter mathematical reasoning model developed by NVIDIA, Tsinghua University, and Stanford University. It is fine-tuned from Qwen2.5-32B with the Negative-aware Fine-Tuning (NFT) algorithm. Unlike standard supervised fine-tuning, NFT explicitly models and learns from the model's own incorrect answers, enabling it to reflect on its failures and improve without an external teacher. This approach achieves performance comparable to leading reinforcement learning methods while retaining the efficiency of supervised learning.
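
Below is a minimal inference sketch using the Hugging Face transformers library. It assumes the checkpoint is published under the nvidia/NFT-32B repository id and exposes a standard Qwen2.5-style chat template; both assumptions should be checked against the actual model repository.

```python
# Minimal inference sketch (assumptions: repo id "nvidia/NFT-32B", Qwen2.5-style chat template).
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "nvidia/NFT-32B"  # assumed Hugging Face repository id
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype="auto",   # load in the checkpoint's native precision
    device_map="auto",    # spread the 32B weights across available GPUs
)

messages = [
    {"role": "user", "content": "If x + y = 10 and xy = 21, find x^2 + y^2."}
]
input_ids = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

output_ids = model.generate(input_ids, max_new_tokens=1024)
print(tokenizer.decode(output_ids[0][input_ids.shape[-1]:], skip_special_tokens=True))
```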

Key Capabilities

  • Advanced Mathematical Reasoning: Designed and optimized for complex mathematical problems, from competition-level benchmarks (AIME, AMC, Olympiad) to general math reasoning benchmarks (MATH500, Minerva Math).
  • Failure-Aware Learning: Utilizes an implicit negative policy to learn from incorrect generations, leading to robust performance improvements.
  • High Performance: Achieves significant improvements over its base model, Qwen2.5-32B, across various math benchmarks, with an average performance increase of 29.6%.
  • Extended Context Window: Supports a context length of up to 131,072 tokens, suitable for multi-step and complex problems.
  • LaTeX Support: Handles mathematical expressions in LaTeX notation for both input and output; a prompting sketch follows this list.
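
As a hedged illustration of LaTeX-formatted prompting, the sketch below reuses the model and tokenizer loaded in the previous example. The instruction to place the final answer in \boxed{} and the sampling parameters follow common practice for math-tuned models; they are assumptions, not documented requirements of NFT-32B.

```python
# LaTeX-formatted math prompt; the \boxed{} instruction and the sampling settings
# are assumptions based on common practice, not documented model requirements.
prompt = (
    "Solve the following problem. Reason step by step and put the final answer "
    "in \\boxed{}.\n\n"
    "Let $f(x) = x^2 - 4x + 7$. Find the minimum value of $f(x)$ for real $x$."
)
messages = [{"role": "user", "content": prompt}]
input_ids = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

# A generous max_new_tokens leaves room for multi-step derivations within the
# 131,072-token context window.
output_ids = model.generate(
    input_ids, max_new_tokens=4096, do_sample=True, temperature=0.6, top_p=0.95
)
print(tokenizer.decode(output_ids[0][input_ids.shape[-1]:], skip_special_tokens=True))
```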

Good For

  • Mathematical Problem Solving: Generating step-by-step solutions for a wide range of mathematical challenges; an answer-extraction sketch follows this list.
  • Research and Development: Exploring advanced supervised learning techniques for specialized tasks.
  • Educational Tools: Assisting with complex math problems where detailed reasoning is required.
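
When solutions are consumed programmatically, for example in an educational tool or an evaluation harness, the final answer usually needs to be pulled out of the generated reasoning. The hypothetical helper below assumes the model follows the \boxed{} convention requested in the prompt; it is a sketch, not part of the model's documented output format.

```python
# Hypothetical helper for extracting the final \boxed{...} answer from a generated
# solution; assumes the \boxed{} convention and does not handle nested braces.
import re
from typing import Optional

def extract_boxed_answer(solution: str) -> Optional[str]:
    matches = re.findall(r"\\boxed\{([^{}]*)\}", solution)
    return matches[-1] if matches else None

print(extract_boxed_answer(r"The minimum value is therefore \boxed{3}."))  # -> 3
```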

Limitations

  • Domain Specificity: Not recommended for general conversation or non-mathematical tasks.
  • Calculation Errors: May still exhibit arithmetic errors in highly complex calculations.
  • Resource Intensive: The 32B model requires substantial GPU memory for inference; a quantized-loading sketch follows below.
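
As a rough sketch of one way to reduce the memory footprint, the checkpoint could be loaded with 4-bit quantization via bitsandbytes. Whether quantization preserves NFT-32B's reasoning quality is not documented here and should be validated on your own benchmarks.

```python
# Memory-saving sketch: 4-bit quantized loading with bitsandbytes.
# Assumption: quantization quality for this checkpoint has not been verified here.
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig

model_id = "nvidia/NFT-32B"  # assumed Hugging Face repository id
quant_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_compute_dtype="bfloat16",  # compute in bf16 for numerical stability
)

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    quantization_config=quant_config,
    device_map="auto",  # distribute layers across available GPUs
)
```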