Overview
Nanbeige4.1-3B Overview
Nanbeige4.1-3B is a 4.1 billion parameter model from Nanbeige, representing an advanced iteration of their Nanbeige4-3B-Base. It has been optimized through supervised fine-tuning (SFT) and reinforcement learning (RL) to achieve a unique combination of capabilities for a compact model.
Key Capabilities
- Strong Reasoning: The model can solve complex, multi-step problems, consistently producing correct answers on challenging tasks like LiveCodeBench-Pro, IMO-Answer-Bench, and AIME 2026 I.
- Robust Preference Alignment: It demonstrates solid alignment performance, surpassing same-scale models (e.g., Qwen3-4B-2507) and even larger models (e.g., Qwen3-30B-A3B) on benchmarks like Arena-Hard-v2 and Multi-Challenge.
- Agentic Capability: Nanbeige4.1-3B is notable for being the first small general model to natively support deep-search tasks, reliably handling complex problem-solving with over 500 rounds of tool invocations. This fills a significant gap in the small-model ecosystem.
Performance Highlights
On general reasoning tasks covering code, math, science, alignment, and tool-use, Nanbeige4.1-3B significantly outperforms same-scale models like Qwen3-4B and shows overall superior performance compared to larger models such as Qwen3-30B-A3B-2507 and Qwen3-32B. Its deep-search capabilities are comparable to specialized agents under 10B parameters, marking a substantial qualitative improvement over prior small general models.