microsoft/Phi-4-reasoning

Text Generation · Concurrency Cost: 1 · Model Size: 14.7B · Quant: FP8 · Ctx Length: 32k · Published: Apr 9, 2025 · License: mit · Architecture: Transformer · 0.2K · Open Weights · Cold

Microsoft's Phi-4-reasoning is a 14.7 billion parameter dense decoder-only Transformer model, fine-tuned from Phi-4. It is optimized for advanced reasoning tasks in math, science, and coding through supervised fine-tuning on chain-of-thought traces and reinforcement learning. It has a 32k-token context length and targets scenarios that require logical deduction and problem-solving, including memory/compute-constrained and latency-bound environments.


Microsoft Phi-4-reasoning: An Advanced Reasoning Model

Microsoft's Phi-4-reasoning is a 14.7 billion parameter decoder-only Transformer model built on the Phi-4 architecture. It was fine-tuned with supervised fine-tuning on chain-of-thought traces and with reinforcement learning, using high-quality data focused on advanced reasoning in math, science, and coding. Each response consists of two parts: a chain-of-thought reasoning block, followed by a summarization block that states the result, so users can follow the model's problem-solving process.
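The two-part response format can be separated in client code so that the reasoning and the final answer are handled independently. The following is a minimal sketch, not an official parser. It assumes the reasoning block is wrapped in <think>...</think> tags, which this page does not state, and the names split_response and ParsedResponse are hypothetical helpers introduced only for illustration.

```python
import re
from typing import NamedTuple


class ParsedResponse(NamedTuple):
    reasoning: str  # text inside the reasoning block, "" if no block was found
    summary: str    # text that follows the reasoning block


# Assumption: the reasoning block is delimited by <think>...</think>.
# Adjust the pattern if your deployment uses different delimiters.
THINK_BLOCK = re.compile(r"<think>(.*?)</think>", re.DOTALL)


def split_response(text: str) -> ParsedResponse:
    """Split raw model output into (reasoning, summary)."""
    match = THINK_BLOCK.search(text)
    if match is None:
        # No reasoning block found: treat the whole output as the summary.
        return ParsedResponse("", text.strip())
    reasoning = match.group(1).strip()
    summary = text[match.end():].strip()
    return ParsedResponse(reasoning, summary)


if __name__ == "__main__":
    raw = "<think>12 * 12 = 144, minus 44 gives 100.</think>\nThe result is 100."
    parsed = split_response(raw)
    print("Reasoning:", parsed.reasoning)
    print("Summary:", parsed.summary)
```

Keeping the reasoning separate lets an application log or verify the chain of thought while showing only the summary to end users. If generation is cut off before the closing tag, this sketch falls back to returning the full text as the summary.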

Key Capabilities

  • Enhanced Reasoning: Optimized for complex logical deduction, mathematical problem-solving (e.g., AIME, OmniMath), scientific inquiry (GPQA-Diamond), and algorithmic tasks (3SAT, TSP).
  • Code Generation: Performs strongly on code generation benchmarks such as LiveCodeBench and HumanEvalPlus, particularly in Python.
  • Efficiency: Designed for use in memory/compute-constrained and latency-bound environments.
  • Structured Output: Generates responses with a clear thought process and a concise solution, aiding in understanding and verification.
  • Robust Safety: Incorporates extensive safety post-training via supervised fine-tuning and rigorous red-teaming evaluations to mitigate harmful content and biases.
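To make the efficiency point concrete, the memory needed just to hold the weights can be estimated from the parameter count and the numeric precision. This is a back-of-the-envelope sketch: it uses the 14.7B parameter count and the FP8 quantization listed on this page, covers weights only (the KV cache, activations, and runtime overhead are excluded), and weight_memory_gib is a hypothetical helper name.

```python
def weight_memory_gib(num_params: float, bits_per_param: int) -> float:
    """Approximate memory for model weights alone, in GiB (2**30 bytes)."""
    total_bytes = num_params * bits_per_param / 8
    return total_bytes / 2**30


if __name__ == "__main__":
    PARAMS = 14.7e9  # parameter count from the listing
    # 16-bit is shown only as a comparison point; FP8 is the listed quantization.
    for name, bits in (("16-bit", 16), ("FP8", 8)):
        print(f"{name}: ~{weight_memory_gib(PARAMS, bits):.1f} GiB for weights")
```

Under these assumptions, FP8 weights take roughly half the memory of a 16-bit copy of the same model. Actual requirements are higher and grow with context length and batch size.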

Good For

  • Accelerating research in language models and serving as a building block for generative AI features.
  • General-purpose AI systems requiring strong reasoning and logic capabilities, primarily in English.
  • Applications where detailed, step-by-step reasoning is crucial for problem-solving.
  • Developers seeking a capable model for math, science, and coding tasks that can operate efficiently.