netease-youdao/Confucius4
Confucius4 is a 27-billion-parameter multimodal large language model developed by the NetEase Youdao AI Team and built on the Qwen3.5 architecture. It is designed for advanced mathematical reasoning and achieves state-of-the-art performance among comparable-scale models on visual math benchmarks. The model uses an iterative SFT-RL optimization paradigm and a refined, compact Chain-of-Thought approach to improve both accuracy and efficiency in problem-solving.
Confucius4: Multimodal Mathematical Reasoning
Confucius4 is a 27-billion-parameter multimodal LLM developed by the NetEase Youdao AI Team, based on the Qwen3.5 architecture. It is optimized for advanced mathematical reasoning, particularly in multimodal contexts, and demonstrates strong performance on visual math benchmarks.
Key Capabilities & Features
- State-of-the-Art Performance: Achieves leading results among models of comparable scale on visual math benchmarks, including Math-Figure, MathVision, and LogicVista.
- Efficient Training: Utilizes a cost-effective multimodal training set created through image-gain filtering, combined with an iterative SFT + RL paradigm for continuous performance improvement.
- Enhanced Reasoning: Incorporates pure-text reasoning data augmentation during SFT to strengthen the model's reasoning foundation, yielding a 23.2% gain on Math-Hard-500.
- Compact Chain-of-Thought (CoT): Employs refined CoT SFT to eliminate redundant steps, producing concise yet complete reasoning chains. A Length-Aware RL mechanism further reduces CoT length by 43.2% on non-challenging problems.
- Chinese Language Optimization: Features targeted optimization on Chinese-language data, resulting in outputs aligned with Chinese linguistic and cultural preferences.
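The Length-Aware RL mechanism listed above is not specified in detail here. As a rough illustration only, one way such a mechanism could work is a reward that stays at full value for correct answers on hard problems but subtracts a bounded length penalty on easy ones. Everything in the sketch below is an assumption made for illustration, not the Confucius4 implementation: the function name `length_aware_reward`, the use of a per-problem `pass_rate` as the difficulty signal, and the values of `target_tokens`, `easy_threshold`, and `max_penalty`.

```python
def length_aware_reward(
    is_correct: bool,
    num_tokens: int,
    pass_rate: float,
    target_tokens: int = 512,      # hypothetical length budget for easy problems
    easy_threshold: float = 0.8,   # hypothetical: pass rate at or above this counts as "easy"
    max_penalty: float = 0.5,      # hypothetical cap on the length penalty
) -> float:
    """Illustrative reward: correctness minus a length penalty on easy problems only."""
    # Wrong answers earn nothing, regardless of length.
    if not is_correct:
        return 0.0

    # Challenging problems (low pass rate) get no length pressure,
    # so the model can still reason at length when it needs to.
    if pass_rate < easy_threshold:
        return 1.0

    # Easy problems: penalize tokens beyond the budget, linearly, up to the cap.
    overflow = max(0, num_tokens - target_tokens)
    penalty = min(max_penalty, max_penalty * overflow / target_tokens)
    return 1.0 - penalty


if __name__ == "__main__":
    # Same long answer, scored on an easy problem and on a hard problem.
    print(length_aware_reward(True, 1024, pass_rate=0.9))
    print(length_aware_reward(True, 1024, pass_rate=0.3))
```

Under these assumptions, a correct but verbose answer to an easy problem scores lower than a concise one, while a correct answer to a hard problem keeps the full reward at any length. This matches the stated goal of shortening CoT only for non-challenging problems.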
When to Use Confucius4
- Multimodal Mathematical Problem Solving: Ideal for tasks requiring reasoning over visual and textual mathematical problems.
- Efficient Reasoning: Suitable for applications where both high accuracy and concise reasoning outputs are desired.
- Chinese-centric Applications: Particularly effective for users and applications requiring strong performance and culturally aligned outputs in Chinese.