microsoft/Phi-3-mini-4k-instruct
Hugging Face
Text Generation | Concurrency Cost: 1 | Model Size: 4B | Quant: BF16 | Ctx Length: 4k | Published: Apr 22, 2024 | License: mit | Architecture: Transformer | 1.4K | Open Weights | Warm

Phi-3-Mini-4K-Instruct is a lightweight, instruction-tuned causal language model with 3.8 billion parameters, developed by Microsoft. It is trained on high-quality, reasoning-dense data and supports a 4096-token context length. The model performs strongly on common sense, language understanding, math, code, and logical reasoning benchmarks among models under 13 billion parameters, which makes it suitable for memory- or compute-constrained and latency-bound environments that need strong reasoning capabilities.
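As a rough illustration of what "memory-constrained" means here, the sketch below estimates the size of the weights alone from the parameter count (3.8 billion) and the BF16 quantization listed for this model. The helper name `weight_memory_gib` and the dtype table are assumptions for illustration; the estimate ignores activations, the KV cache, and runtime overhead, so real usage will be higher.

```python
# Rough weight-only memory estimate. Ignores activations, KV cache,
# and framework overhead, so treat the result as a lower bound.
BYTES_PER_PARAM = {"bf16": 2, "fp16": 2, "fp32": 4, "int8": 1}


def weight_memory_gib(num_params: float, dtype: str = "bf16") -> float:
    """Return the approximate size of the model weights in GiB."""
    return num_params * BYTES_PER_PARAM[dtype] / 2**30


# 3.8 billion parameters stored as BF16 (2 bytes each).
phi3_mini_gib = weight_memory_gib(3.8e9, "bf16")
print(f"Approximate weight memory: {phi3_mini_gib:.2f} GiB")
```

Under these assumptions the weights come to roughly 7.1 GiB.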


Phi-3-Mini-4K-Instruct Overview

The Phi-3-Mini-4K-Instruct is a 3.8 billion parameter, instruction-tuned language model developed by Microsoft. It is part of the Phi-3 family, designed to be lightweight yet powerful, supporting a 4096-token context window. The model has undergone extensive post-training, including supervised fine-tuning and direct preference optimization, to enhance instruction following and safety.
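Because the 4096-token window has to hold both the prompt and the generated reply, it can help to budget tokens before sending a request. The following is a minimal sketch, assuming the prompt has already been tokenized and its length is known; `max_new_tokens` and `reserve` are hypothetical names, not part of any library.

```python
CONTEXT_LENGTH = 4096  # context window stated in the model description


def max_new_tokens(prompt_tokens: int, reserve: int = 0) -> int:
    """Return how many tokens can still be generated within the window.

    prompt_tokens: number of tokens already used by the prompt.
    reserve: optional safety margin to leave unused.
    """
    if prompt_tokens < 0 or reserve < 0:
        raise ValueError("token counts must be non-negative")
    return max(0, CONTEXT_LENGTH - prompt_tokens - reserve)


print(max_new_tokens(3000))  # 1096 tokens left for the reply
print(max_new_tokens(5000))  # 0: the prompt alone exceeds the window
```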

Key Capabilities

  • Strong Reasoning: Achieves robust performance across benchmarks for common sense, language understanding, math, code, and logical reasoning, often comparable to or surpassing larger models in the under-13B-parameter class.
  • Optimized for Constraints: Designed for environments with limited memory/compute resources and scenarios requiring low latency.
  • Improved Instruction Following: A June 2024 update significantly boosted instruction following, structured output (JSON, XML), and multi-turn conversation quality, including explicit support for the <|system|> tag.
  • Cross-Platform Support: Optimized ONNX models are available for efficient inference on CPU, GPU (including DirectML), and mobile devices.
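
To make the `<|system|>` support concrete, here is a minimal sketch of how a multi-turn prompt could be assembled by hand. Only the `<|system|>` tag is named in the text above; the `<|user|>`, `<|assistant|>`, and `<|end|>` markers are assumptions based on the chat format described in the upstream model card, and `build_phi3_prompt` is a hypothetical helper. In practice, the chat template shipped with the model's tokenizer is the authoritative source for the format.

```python
from typing import Dict, List

END_OF_TURN = "<|end|>"  # assumed end-of-turn marker, not stated in this page
ALLOWED_ROLES = {"system", "user", "assistant"}


def build_phi3_prompt(messages: List[Dict[str, str]]) -> str:
    """Join chat messages into a single prompt string.

    Each message is a dict with "role" and "content" keys. The prompt
    ends with an open assistant turn so the model continues from there.
    """
    parts = []
    for message in messages:
        role = message["role"]
        if role not in ALLOWED_ROLES:
            raise ValueError(f"unsupported role: {role!r}")
        parts.append(f"<|{role}|>\n{message['content']}{END_OF_TURN}\n")
    parts.append("<|assistant|>\n")
    return "".join(parts)


prompt = build_phi3_prompt([
    {"role": "system", "content": "Reply only with valid JSON."},
    {"role": "user", "content": "List two prime numbers."},
])
print(prompt)
```

Putting the output constraint in the system turn, as in this example, is one way to exercise the structured-output behavior mentioned above.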

Good For

  • General Purpose AI Systems: Ideal as a building block for generative AI features in English-language applications.
  • Resource-Constrained Deployments: Excellent for edge devices or applications where computational resources are limited.
  • Reasoning-Intensive Tasks: Particularly strong in mathematical and logical problem-solving.
  • Developers Seeking Efficiency: Offers a powerful model in a smaller package, accelerating research and development in language models.

Popular Sampler Settings

Top 3 parameter combinations used by Featherless users for this model. Each combination sets the following parameters:

  • temperature
  • top_p
  • top_k
  • frequency_penalty
  • presence_penalty
  • repetition_penalty
  • min_p
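
A sampler configuration built from the parameter names above might be assembled as in the sketch below. The numeric values are placeholders chosen for illustration, not the popular settings referred to above, and `make_sampler_config` is a hypothetical helper rather than part of any API.

```python
import json

# Parameter names taken from the list above.
SAMPLER_KEYS = {
    "temperature",
    "top_p",
    "top_k",
    "frequency_penalty",
    "presence_penalty",
    "repetition_penalty",
    "min_p",
}


def make_sampler_config(**settings: float) -> str:
    """Validate sampler settings by name and serialize them to JSON."""
    unknown = set(settings) - SAMPLER_KEYS
    if unknown:
        raise ValueError(f"unknown sampler parameters: {sorted(unknown)}")
    return json.dumps(settings, sort_keys=True)


# Placeholder values, for illustration only.
example_config = make_sampler_config(
    temperature=0.7, top_p=0.9, repetition_penalty=1.1
)
print(example_config)
```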