unsloth/Phi-4-mini-instruct
Hugging Face
TEXT GENERATION · Concurrency Cost: 1 · Model Size: 3.8B · Quant: BF16 · Ctx Length: 32k · Published: Feb 27, 2025 · License: MIT · Architecture: Transformer · Open Weights

unsloth/Phi-4-mini-instruct is a 3.8-billion-parameter, decoder-only Transformer model developed by Microsoft and repackaged by Unsloth with bug fixes and optimizations for efficient fine-tuning. It offers a 131,072-token context length and a 200K-token vocabulary, excels at reasoning tasks (particularly math and logic), and is designed for memory- and compute-constrained, latency-bound environments. The model was trained on synthetic and filtered public data with an emphasis on high-quality, reasoning-dense content, and is licensed for broad multilingual commercial and research use.


Overview

unsloth/Phi-4-mini-instruct is a 3.8-billion-parameter instruction-tuned model from the Phi-4 family, developed by Microsoft and further refined by Unsloth with critical bug fixes. It features a substantial 131,072-token context length and an expanded 200K-token vocabulary, supporting broad multilingual capabilities. The model was trained on 5 trillion tokens, combining filtered public data with newly created synthetic, "textbook-like" data specifically designed to strengthen reasoning in math, coding, and common sense.
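As an instruction-tuned model, Phi-4-mini expects prompts in its chat format, in which turns are delimited by `<|system|>`, `<|user|>`, `<|assistant|>`, and `<|end|>` tokens. The helper below is a minimal sketch of that layout; in practice the tokenizer's `apply_chat_template` should be preferred, and the exact delimiter placement should be verified against the model's released chat template.

```python
def build_phi4_mini_prompt(system: str, user: str) -> str:
    """Assemble a single-turn prompt in Phi-4-mini's chat format.

    Sketch only: the delimiter layout is taken from the model card and
    should be checked against the tokenizer's chat template.
    """
    return (
        f"<|system|>{system}<|end|>"
        f"<|user|>{user}<|end|>"
        f"<|assistant|>"
    )

prompt = build_phi4_mini_prompt(
    "You are a concise math tutor.",
    "What is 12 * 13?",
)
```

The trailing `<|assistant|>` leaves the prompt open for the model to generate its reply, which is how generation prompts are typically terminated in this family.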

Key Capabilities

  • Strong Reasoning: Excels in mathematical and logical reasoning, as evidenced by high scores on benchmarks like GSM8K (88.6) and MATH (64.0).
  • Multilingual Support: Features a larger vocabulary and improved architecture for multilingual understanding, covering 22 languages.
  • Efficiency: Designed for memory/compute-constrained and latency-bound scenarios, making it suitable for edge deployments.
  • Instruction Adherence: Underwent supervised fine-tuning and direct preference optimization for precise instruction following and robust safety.
  • Function Calling: Supports tool-enabled function calling, with tool definitions supplied to the model in a dedicated system-message format.

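The function-calling capability above relies on Phi-4-mini's tool format, in which tool definitions are serialized as JSON inside `<|tool|>` … `<|/tool|>` markers within the system message. The sketch below builds such a system message; the marker names, the JSON shape, and the `get_weather` tool are assumptions for illustration and should be verified against the released chat template before use.

```python
import json

def build_tool_system_message(instructions: str, tools: list[dict]) -> str:
    # Tools are serialized as a JSON list between <|tool|> and <|/tool|>
    # markers inside the system message (per the model card description;
    # verify against the released chat template).
    return f"{instructions}<|tool|>{json.dumps(tools)}<|/tool|>"

# Hypothetical example tool definition, for illustration only.
weather_tool = {
    "name": "get_weather",
    "description": "Get the current weather for a city",
    "parameters": {
        "city": {"type": "str", "description": "City name"},
    },
}

system_msg = build_tool_system_message(
    "You are a helpful assistant with access to tools.",
    [weather_tool],
)
```

When prompted this way, the model is expected to emit a structured call naming the tool and its arguments, which the caller then executes and feeds back as a new turn.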
Good For

  • Resource-constrained environments: Ideal for applications where memory or computational power is limited.
  • Latency-sensitive applications: Provides fast inference for scenarios requiring quick responses.
  • Reasoning-heavy tasks: Particularly strong in math and logic problems.
  • Multilingual applications: Benefits from an expanded vocabulary and training for diverse languages.
  • Research and development: Serves as a building block for generative AI features and accelerating research in language models.