3DJ77/NVIDIA-Nemotron-3-Nano-30B-A3B-BF16
NVIDIA-Nemotron-3-Nano-30B-A3B-BF16 is a 30 billion parameter large language model developed by NVIDIA, featuring a hybrid Mixture-of-Experts (MoE) architecture with Mamba-2 and Attention layers. Designed for both reasoning and non-reasoning tasks, it can generate explicit reasoning traces for higher accuracy on complex prompts. This model supports a 1M token context length and is optimized for agentic systems, chatbots, and RAG applications across English and several other languages.
Loading preview...
Model Overview
NVIDIA-Nemotron-3-Nano-30B-A3B-BF16 is a 30 billion parameter large language model (LLM) developed by NVIDIA, featuring a unique hybrid Mixture-of-Experts (MoE) architecture. It combines 23 Mamba-2 and MoE layers with 6 Attention layers, activating 6 out of 128 experts plus 1 shared expert per token, resulting in 3.5 billion active parameters. The model is designed for both reasoning and non-reasoning tasks, capable of generating explicit reasoning traces to improve accuracy on challenging prompts, a feature configurable via the chat template.
Key Capabilities
- Advanced Reasoning: Can generate step-by-step reasoning traces for complex problems, enhancing solution quality.
- Hybrid MoE Architecture: Leverages a Mamba-2 and Transformer hybrid MoE design for efficiency and performance.
- Extensive Context Window: Supports an impressive 1 million token context length, suitable for long-document analysis.
- Multilingual Support: Supports English, German, Spanish, French, Italian, and Japanese, with improved performance using Qwen.
- Commercial Use Ready: Licensed for commercial applications.
- Comprehensive Training: Trained on 25 trillion tokens, including a significant portion of synthetic data across code, math, science, and general knowledge.
Good For
- AI Agent Systems: Ideal for developers building sophisticated AI agents that require robust reasoning capabilities.
- Chatbots and Conversational AI: Suitable for creating high-quality, instruction-following chatbots.
- RAG Systems: Effective for Retrieval-Augmented Generation applications due to its long context handling.
- Instruction Following: Excels at general instruction-following tasks, with configurable reasoning behavior.