Frugal-Math-4B is a 4-billion-parameter, reasoning-optimized variant of Qwen3-4B-Thinking-2507 developed by MBZUAI-Paris. It was trained with Reinforcement Learning with Verifiable Rewards (RLVR), supports a 40,960-token context length, and specializes in generating concise, verifiable mathematical solutions. The model produces substantially shorter reasoning traces while maintaining or improving accuracy on complex math benchmarks, which makes it well suited to efficient mathematical reasoning tasks.
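To make "verifiable rewards" concrete, here is a minimal sketch of the kind of check RLVR relies on: a solution earns reward only if its final answer can be matched programmatically against a known reference. This is a generic illustration, not the reward function used to train Frugal-Math-4B. The function names, the `\boxed{...}` answer convention, and the binary 0/1 scoring are all assumptions made for the example.

```python
import re
from typing import Optional


def extract_boxed_answer(solution: str) -> Optional[str]:
    # Hypothetical helper: return the contents of the last \boxed{...} in the text.
    # Limitation of this sketch: answers containing nested braces,
    # such as \boxed{\frac{1}{2}}, are not matched.
    matches = re.findall(r"\\boxed\{([^{}]*)\}", solution)
    return matches[-1].strip() if matches else None


def verifiable_reward(solution: str, reference: str) -> float:
    # Hypothetical binary reward: 1.0 if the final boxed answer equals the
    # reference after removing spaces, otherwise 0.0.
    answer = extract_boxed_answer(solution)
    if answer is None:
        return 0.0  # no checkable answer, so no reward
    return 1.0 if answer.replace(" ", "") == reference.replace(" ", "") else 0.0
```

For example, `verifiable_reward(r"Adding gives \boxed{12}.", "12")` returns `1.0`, while a solution with no boxed answer returns `0.0`. A real training setup would typically use a more robust answer matcher that handles equivalent forms, such as fractions versus decimals.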