Edith67677/Phi-4-mini-instruct

Text Generation · Concurrency Cost: 1 · Model Size: 3.8B · Quant: BF16 · Ctx Length: 32k · Published: Apr 2, 2026 · License: MIT · Architecture: Transformer · Open Weights

Phi-4-mini-instruct is a 3.8-billion-parameter instruction-tuned decoder-only Transformer model developed by Microsoft. It was trained on synthetic data and filtered public web content, with a focus on high-quality, reasoning-dense data. The model supports a 128K-token context length, is well suited to memory/compute-constrained environments and latency-bound scenarios, and shows strong reasoning, particularly in math and logic.


Model Overview

Phi-4-mini-instruct is a 3.8 billion parameter instruction-tuned model from Microsoft's Phi-4 family, designed for efficiency and strong reasoning capabilities. It features a 128K token context length and a large 200K vocabulary, enhancing multilingual support. The model was developed using synthetic and filtered high-quality public data, with a focus on reasoning-dense content, and underwent supervised fine-tuning and direct preference optimization for precise instruction adherence and safety.
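
The snippet below is a minimal sketch of loading the model for chat-style generation with the Hugging Face transformers library, assuming a recent transformers version with chat-template support in pipelines. The repo ID used is this listing's; Microsoft's upstream checkpoint is noted as an alternative.

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, pipeline

# This listing's repo ID; the upstream release is
# "microsoft/Phi-4-mini-instruct".
model_id = "Edith67677/Phi-4-mini-instruct"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,  # matches the BF16 quant listed above
    device_map="auto",           # place weights on available GPU(s)/CPU
)

generator = pipeline("text-generation", model=model, tokenizer=tokenizer)

messages = [
    {"role": "system", "content": "You are a helpful assistant."},
    {"role": "user", "content": "If 3x + 5 = 20, what is x?"},
]

# Greedy decoding keeps math answers deterministic.
output = generator(messages, max_new_tokens=256, do_sample=False)
print(output[0]["generated_text"][-1]["content"])
```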

Key Capabilities

  • Enhanced Reasoning: Demonstrates strong performance in math and logic, with notable scores on benchmarks like GSM8K (88.6) and MATH (64.0), often outperforming similarly sized models.
  • Multilingual Support: Features an expanded vocabulary and improved architecture for better performance across 23 supported languages, including Arabic, Chinese, English, French, German, Japanese, and Spanish.
  • Instruction Following & Function Calling: Improved post-training techniques lead to robust instruction adherence and function calling, with support for JSON-formatted tool definitions (see the sketch after this list).
  • Efficiency: Optimized for memory/compute constrained environments and latency-bound scenarios, making it suitable for edge deployments.
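
As a rough illustration of the JSON tool-definition support mentioned above, the sketch below builds a hypothetical get_weather tool and inlines it in the system prompt before rendering the chat template. The tool name, schema, and prompt wording are illustrative assumptions; the exact tool-delimiter format the model was trained on is documented in Microsoft's upstream model card and should be followed for production use.

```python
import json
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("Edith67677/Phi-4-mini-instruct")

# Hypothetical tool definition; any JSON schema of this shape illustrates
# the idea. Check the upstream model card for the trained-on delimiters.
tools = [
    {
        "name": "get_weather",
        "description": "Get the current weather for a city.",
        "parameters": {
            "type": "object",
            "properties": {"city": {"type": "string"}},
            "required": ["city"],
        },
    }
]

messages = [
    {
        "role": "system",
        "content": "You are a helpful assistant with access to these tools: "
        + json.dumps(tools),
    },
    {"role": "user", "content": "What's the weather in Tokyo right now?"},
]

# Render the prompt the model will actually see; pass it (or the tokenized
# form) to model.generate() as in the loading example above.
prompt = tokenizer.apply_chat_template(
    messages, tokenize=False, add_generation_prompt=True
)
print(prompt)
```

When a user request matches a tool, the model is expected to reply with a JSON function call; the caller parses it, runs the tool, and appends the result as a follow-up message before generating again.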

Good for

  • General Purpose AI Systems: Ideal for broad commercial and research use requiring general AI capabilities.
  • Resource-Constrained Applications: Suitable for environments with limited memory or computational power.
  • Latency-Sensitive Use Cases: Designed for scenarios where quick response times are critical.
  • Reasoning-Intensive Tasks: Particularly strong in mathematical and logical problem-solving.
  • Multilingual Applications: Benefits from an expanded vocabulary and improved architecture for diverse language support.