Overview
Microsoft's Phi-4-mini-reasoning is a 3.8-billion-parameter model in the Phi-4 family, designed specifically for advanced mathematical reasoning. It supports a 128K-token context length and is trained on synthetic, high-quality, reasoning-dense data, then further fine-tuned for stronger math capabilities. Like Phi-4-Mini, its architecture uses a 200K-token vocabulary, grouped-query attention, and shared input/output embeddings.
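To make the benefit of grouped-query attention at long context concrete, the sketch below estimates key/value cache size at 128K tokens. The layer count, head counts, and head dimension are illustrative assumptions that do not come from the text above, and `kv_cache_bytes` is a hypothetical helper; only the 128K context length is taken from this overview.

```python
def kv_cache_bytes(seq_len, num_layers, num_kv_heads, head_dim, bytes_per_value=2):
    """Bytes needed to cache keys and values for one sequence."""
    # Factor of 2: one tensor for keys and one for values in every layer.
    return 2 * num_layers * num_kv_heads * head_dim * seq_len * bytes_per_value


# Illustrative shape values (assumptions, not stated in this overview).
NUM_LAYERS = 32
NUM_QUERY_HEADS = 24
NUM_KV_HEADS = 8          # grouped-query attention: fewer KV heads than query heads
HEAD_DIM = 128
CONTEXT_TOKENS = 128 * 1024  # the 128K context length, read as 131,072 tokens

# With GQA, only the KV heads are cached.
gqa_bytes = kv_cache_bytes(CONTEXT_TOKENS, NUM_LAYERS, NUM_KV_HEADS, HEAD_DIM)
# Baseline: standard multi-head attention caches one KV head per query head.
mha_bytes = kv_cache_bytes(CONTEXT_TOKENS, NUM_LAYERS, NUM_QUERY_HEADS, HEAD_DIM)

print(f"GQA cache: {gqa_bytes / 2**30:.1f} GiB")
print(f"MHA cache: {mha_bytes / 2**30:.1f} GiB")
print(f"Reduction: {mha_bytes / gqa_bytes:.1f}x")
```

Under these assumed shapes and 16-bit cache values, the full-context cache comes to 16 GiB with grouped-query attention versus 48 GiB without it, which is the kind of saving that matters in memory-constrained deployments.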
Key Capabilities
- Multi-step Mathematical Problem-Solving: Excels at complex math problems, formal proof generation, symbolic computation, and advanced word problems.
- Efficiency: Optimized for memory- and compute-constrained environments and latency-bound scenarios, making it suitable for edge or mobile deployment.
- Knowledge Distillation: Fine-tuned on synthetic math data generated by a more capable model (DeepSeek-R1); the data comprises over one million diverse math problems with verified solutions.
- Performance: Achieves competitive scores on reasoning benchmarks such as AIME (57.5), MATH-500 (94.6), and GPQA Diamond (52.0), often outperforming larger models.
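The overview does not say how the solutions in the distillation data were verified. One common approach, sketched here purely as an assumption, is to keep a generated solution only when its final answer matches a known reference answer. The sample fields, the `\boxed{...}` answer convention, and the helper names `extract_final_answer` and `filter_verified` are all hypothetical.

```python
import re

# Assumed convention: the final answer is wrapped in \boxed{...}.
BOXED = re.compile(r"\\boxed\{([^{}]*)\}")


def extract_final_answer(solution):
    """Return the content of the last \\boxed{...} in a solution, or None."""
    matches = BOXED.findall(solution)
    return matches[-1].strip() if matches else None


def filter_verified(samples):
    """Keep only samples whose extracted answer equals the reference answer."""
    kept = []
    for sample in samples:
        answer = extract_final_answer(sample["generated_solution"])
        if answer is not None and answer == sample["reference_answer"].strip():
            kept.append(sample)
    return kept


# Toy data standing in for teacher-generated solutions (hypothetical).
samples = [
    {"problem": "2 + 2", "reference_answer": "4",
     "generated_solution": "Adding gives \\boxed{4}."},
    {"problem": "3 * 5", "reference_answer": "15",
     "generated_solution": "The product is \\boxed{12}."},
    {"problem": "10 / 2", "reference_answer": "5",
     "generated_solution": "The answer is five."},
]

verified = filter_verified(samples)
print([s["problem"] for s in verified])
```

Exact string matching is deliberately strict; a real pipeline would likely normalize equivalent forms (for example `1/2` and `0.5`) before comparing.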
Good For
- Mathematical Reasoning Applications: Ideal for tasks requiring deep analytical thinking and structured logic.
- Educational Tools: Potentially suitable for embedded tutoring and other educational applications.
- Resource-Constrained Deployments: Designed for scenarios where computing power or latency is limited.
Limitations
- Primarily designed and tested for math reasoning; not evaluated for all downstream purposes.
- Limited capacity for factual knowledge due to its size, which may lead to factual errors; this can be mitigated with retrieval-augmented generation (RAG).
- Performance varies across non-English languages and less-represented varieties of English.
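The RAG mitigation mentioned above amounts to placing retrieved facts in the prompt so the model does not have to recall them. The following is a minimal sketch under stated assumptions: retrieval is a toy word-overlap ranking rather than a real embedding search, the prompt layout is arbitrary, and `tokenize`, `retrieve`, and `build_prompt` are hypothetical helpers, not part of any Phi-4 tooling.

```python
def tokenize(text):
    """Lowercase a string and split it into a set of alphanumeric words."""
    cleaned = "".join(ch.lower() if ch.isalnum() else " " for ch in text)
    return set(cleaned.split())


def retrieve(question, documents, top_k=1):
    """Rank documents by word overlap with the question; return the best top_k."""
    q_words = tokenize(question)
    ranked = sorted(documents, key=lambda d: len(q_words & tokenize(d)), reverse=True)
    return ranked[:top_k]


def build_prompt(question, documents, top_k=1):
    """Prepend the retrieved context to the question (assumed prompt layout)."""
    context = "\n".join(retrieve(question, documents, top_k))
    return f"Context:\n{context}\n\nQuestion: {question}\nAnswer:"


# Tiny stand-in knowledge base (hypothetical).
documents = [
    "The speed of light in vacuum is 299792458 metres per second.",
    "Avogadro's constant is 6.02214076e23 per mole.",
]

question = "How far does light travel in vacuum in 2 seconds?"
prompt = build_prompt(question, documents)
print(prompt)
```

The resulting prompt would then be passed to the model, which only needs to do the arithmetic rather than recall the constant from its own parameters.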