nvidia/NVIDIA-Nemotron-3-Nano-30B-A3B-BF16

Hugging Face
TEXT GENERATIONConcurrency Cost:2Model Size:30BQuant:FP8Ctx Length:32kPublished:Dec 4, 2025License:otherArchitecture:Transformer0.8K Warm

NVIDIA-Nemotron-3-Nano-30B-A3B-BF16 is a 30 billion parameter hybrid Mixture-of-Experts (MoE) large language model developed by NVIDIA, featuring 3.5B active parameters and a 1M token context length. It is designed for both reasoning and non-reasoning tasks, capable of generating reasoning traces for higher accuracy in complex queries. This model is optimized for agentic reasoning, general instruction following, and chat applications, supporting English and several other languages.

Loading preview...

Model Overview

NVIDIA-Nemotron-3-Nano-30B-A3B-BF16 is a 30 billion parameter large language model developed by NVIDIA, trained from scratch with a hybrid Mixture-of-Experts (MoE) architecture. It combines 23 Mamba-2 and MoE layers with 6 Attention layers, featuring 3.5 billion active parameters and supporting a maximum context length of 1 million tokens. The model is designed to excel in both reasoning and non-reasoning tasks, capable of generating explicit reasoning traces to improve accuracy on complex prompts, a feature configurable via the chat template.

Key Capabilities

  • Agentic Reasoning: Optimized for AI agent systems, it can generate reasoning traces for higher-quality solutions.
  • Hybrid MoE Architecture: Utilizes a unique blend of Mamba-2, MoE, and Attention layers for efficient performance.
  • Multilingual Support: Supports English, German, Spanish, French, Italian, and Japanese, with extensive multilingual training data.
  • Commercial Use: Licensed for commercial applications.
  • Extensive Training: Pre-trained on 25 trillion tokens, including a significant portion of synthetic data, and fine-tuned with multi-environment reinforcement learning.

Good for

  • AI Agent Systems: Ideal for developers building sophisticated AI agents requiring robust reasoning capabilities.
  • Chatbots & RAG Systems: Suitable for general-purpose chat and instruction-following tasks, as well as Retrieval Augmented Generation (RAG) systems.
  • Instruction Following: Excels at typical instruction-following tasks, with configurable reasoning behavior.