Name: nvidia/NVIDIA-Nemotron-3-Nano-30B-A3B-BF16 API
Brand: Featherless.ai
Price: 25.00 USD
Availability: InStock
Author: nvidia

Model Overview

NVIDIA-Nemotron-3-Nano-30B-A3B-BF16 is a 30 billion parameter large language model (LLM) developed by NVIDIA, featuring a unique hybrid Mixture-of-Experts (MoE) architecture. It combines 23 Mamba-2 and MoE layers with 6 Attention layers, utilizing 3.5 billion active parameters. This model is specifically designed to excel in both reasoning and non-reasoning tasks, offering a configurable option to generate intermediate reasoning traces for improved accuracy on challenging prompts.

Key Capabilities

Advanced Reasoning: Can generate detailed reasoning traces before providing a final answer, enhancing solution quality for complex problems.
Hybrid MoE Architecture: Leverages a combination of Mamba-2 and Attention layers for efficient processing.
Multilingual Support: Supports English, German, Spanish, French, Italian, and Japanese.
Extensive Training: Pre-trained on 25 trillion tokens, including a significant amount of synthetic data across various domains like code, math, science, and general knowledge.
Long Context: Supports a maximum input and output size of 1 million tokens.

Good For

AI Agent Systems: Ideal for developers building sophisticated AI agents that require robust reasoning capabilities.
Chatbots and RAG Systems: Suitable for creating highly accurate and context-aware conversational AI and retrieval-augmented generation applications.
Instruction Following: Performs well on general instruction-following tasks.
Commercial Use: Licensed for commercial deployment.

Overview

Model Overview

Key Capabilities

Good For

Full Model Card (README)