microsoft/Phi-4-reasoning-plus
microsoft/Phi-4-reasoning-plus is a 14.7-billion-parameter dense decoder-only Transformer model developed by Microsoft Research. It is fine-tuned from Phi-4 using supervised fine-tuning on chain-of-thought traces followed by reinforcement learning, and is optimized for advanced reasoning in math, science, and coding. With a 32k-token context length, it targets general-purpose AI systems that need strong reasoning and logic in memory- or compute-constrained or latency-bound environments.
Model Overview
microsoft/Phi-4-reasoning-plus is a 14.7 billion parameter model developed by Microsoft Research, building upon the Phi-4 architecture. It has been significantly enhanced through supervised fine-tuning on a dataset of chain-of-thought traces and further refined with reinforcement learning. This process focused on high-quality data for math, science, and coding skills, alongside safety alignment.
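Because the model is trained on chain-of-thought traces, a response typically contains a reasoning trace followed by a final answer. The following is a minimal sketch of separating the two, assuming the trace is wrapped in `<think>...</think>` tags; that delimiter is an assumption not stated in this overview, and `split_reasoning` is a hypothetical helper, not part of any library.

```python
import re

# Assumption: the reasoning trace is delimited by <think>...</think> tags.
THINK_BLOCK = re.compile(r"<think>(.*?)</think>", re.DOTALL)


def split_reasoning(text: str) -> tuple[str, str]:
    """Return (reasoning, answer); reasoning is empty if no think block is found."""
    match = THINK_BLOCK.search(text)
    if match is None:
        # No delimited trace: treat the whole response as the answer.
        return "", text.strip()
    reasoning = match.group(1).strip()
    # Everything after the closing tag is treated as the final answer.
    answer = text[match.end():].strip()
    return reasoning, answer


sample = "<think>2 + 2 is 4, times 3 is 12.</think>\nThe result is 12."
reasoning, answer = split_reasoning(sample)
print(answer)
```

Keeping the trace separate makes it possible to show only the final answer to end users while logging the reasoning for inspection.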
Key Capabilities & Differentiators
- Advanced Reasoning: Specifically designed and optimized for complex reasoning tasks, particularly in mathematics, science, and coding, utilizing a chain-of-thought approach.
- Enhanced Accuracy: Reinforcement learning further improves accuracy, at the cost of longer generated outputs and therefore more generation time per response.
- Context Length: Supports a substantial 32k token context length, enabling deep, multi-step reasoning over extended inputs.
- Performance: Benchmarks show strong performance on reasoning tasks like AIME, OmniMath, and GPQA-Diamond, often outperforming significantly larger open-weight models.
- Instruction Following: Demonstrates improved instruction following and general abilities compared to its predecessor, Phi-4.
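Since the prompt and the generated reasoning share the same context window, long chain-of-thought outputs reduce the room left for input. The sketch below shows one way to budget tokens before a request. It assumes "32k" means 32 × 1024 = 32,768 tokens and that the prompt token count has already been obtained from the tokenizer; both function names are hypothetical.

```python
# Assumption: "32k" is interpreted as 32 * 1024 tokens.
CONTEXT_WINDOW = 32 * 1024


def max_new_tokens_available(prompt_tokens: int,
                             context_window: int = CONTEXT_WINDOW) -> int:
    """Tokens left for generation once the prompt is placed in the window."""
    if prompt_tokens < 0:
        raise ValueError("prompt_tokens must be non-negative")
    return max(context_window - prompt_tokens, 0)


def fits(prompt_tokens: int, requested_new_tokens: int,
         context_window: int = CONTEXT_WINDOW) -> bool:
    """True if the prompt plus the requested generation fits in the window."""
    return requested_new_tokens <= max_new_tokens_available(
        prompt_tokens, context_window
    )


print(max_new_tokens_available(2_000))
print(fits(30_000, 4_096))
```

A check like this can be run before each call so that a long prompt does not silently truncate the reasoning trace.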
Intended Use Cases
This model is intended to accelerate research on language models and to serve as a building block for generative AI features. It is particularly well suited for:
- Memory/Compute Constrained Environments: Its efficient design makes it viable where resources are limited.
- Latency-Bound Scenarios: Suited to applications where response time is critical.
- Reasoning and Logic-Intensive Tasks: Excels in applications requiring advanced analytical and problem-solving capabilities, primarily in English.
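To make the memory constraint concrete, the weight footprint can be approximated from the parameter count alone. This is a rough sketch that covers weights only; it ignores activations, the KV cache, and runtime overhead, and the precision table is an illustrative assumption rather than a list of officially supported formats.

```python
# Parameter count taken from the description above.
PARAMS = 14.7e9

# Assumption: nominal bytes per parameter for common storage precisions.
BYTES_PER_PARAM = {"fp32": 4.0, "bf16": 2.0, "int8": 1.0, "int4": 0.5}


def weight_memory_gb(precision: str, params: float = PARAMS) -> float:
    """Approximate weight storage in gigabytes (1 GB = 1e9 bytes)."""
    return params * BYTES_PER_PARAM[precision] / 1e9


for name in BYTES_PER_PARAM:
    print(f"{name}: {weight_memory_gb(name):.2f} GB")
```

Under these assumptions the weights take roughly 29.4 GB at 16-bit precision and about 7.35 GB at 4-bit, which is why quantization is usually the first lever in constrained deployments.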