unsloth/Phi-4-reasoning-plus
Phi-4-reasoning-plus is a 14.7 billion parameter decoder-only Transformer model developed by Microsoft Research, fine-tuned from Phi-4. It is optimized for advanced reasoning tasks in math, science, and coding, utilizing supervised fine-tuning on chain-of-thought traces and reinforcement learning. With a 32k token context length, this model excels at generating detailed reasoning chains followed by summarized solutions. Its primary use cases include accelerating research in language models and serving as a building block for generative AI applications requiring strong reasoning capabilities in memory/compute-constrained and latency-bound environments.
Loading preview...
What is unsloth/Phi-4-reasoning-plus?
unsloth/Phi-4-reasoning-plus is a 14.7 billion parameter language model from Microsoft Research, built upon the Phi-4 architecture. It is specifically fine-tuned for advanced reasoning tasks in mathematics, science, and coding. The model leverages supervised fine-tuning on chain-of-thought (CoT) traces and reinforcement learning to enhance its problem-solving abilities.
Key Capabilities
- Enhanced Reasoning: Excels in complex reasoning tasks, generating detailed thought processes before providing solutions.
- Specialized Training: Fine-tuned on high-quality datasets focusing on math, science, and coding skills.
- Context Length: Supports a substantial 32k token context window, with experimental support up to 64k tokens for longer reasoning sequences.
- Performance: Demonstrates strong performance on reasoning benchmarks like AIME, OmniMath, and GPQA-Diamond, often outperforming larger open-weight models.
- Structured Output: Designed to produce responses with distinct 'Thought' and 'Solution' sections, aiding clarity and analysis.
Good for
- Research & Development: Ideal for accelerating research in language models and as a foundation for generative AI features.
- Reasoning-Intensive Applications: Suited for applications requiring strong logical deduction, problem-solving, and multi-step reasoning.
- Resource-Constrained Environments: Optimized for use in memory/compute-constrained and latency-bound scenarios due to its efficient architecture.
- Educational Tools: Can be used to develop tools that explain complex concepts through detailed reasoning steps.