microsoft/Phi-3-medium-4k-instruct
Phi-3-Medium-4K-Instruct is a 14.7-billion-parameter, lightweight, instruction-tuned causal language model developed by Microsoft. Trained on a high-quality, reasoning-dense dataset, it supports a 4K-token context length and excels at common-sense reasoning, language understanding, math, code, and logical reasoning tasks. The model is optimized for strong reasoning in memory- and compute-constrained, latency-bound environments.
Overview
Microsoft's Phi-3-Medium-4K-Instruct is a 14.7-billion-parameter, instruction-tuned model from the Phi-3 family, designed for robust performance in resource-constrained settings. It was trained on a dataset combining synthetic data and filtered public web content, with an emphasis on quality and reasoning density. The model then underwent supervised fine-tuning (SFT) and direct preference optimization (DPO) to improve instruction following and safety.
Key Capabilities
- Strong Reasoning: Achieves state-of-the-art performance among models of the same size and the next size up on benchmarks covering common sense, language understanding, math, code, and logical reasoning.
- Optimized for Efficiency: Designed for memory/compute-constrained environments and latency-bound scenarios.
- Instruction Following: Fine-tuned with SFT and DPO for effective instruction adherence and safety.
- 4K Context Length: Supports a 4,096-token context window.
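The capabilities above can be exercised with a short inference sketch. This example is not part of the card itself; it assumes the standard Hugging Face `transformers` chat workflow for this model ID, and the prompt is purely illustrative.

```python
# Minimal inference sketch (illustrative, not from the model card).
# Assumes transformers is installed; device_map="auto" also needs accelerate.
from transformers import AutoModelForCausalLM, AutoTokenizer, pipeline

MODEL_ID = "microsoft/Phi-3-medium-4k-instruct"


def main():
    tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
    model = AutoModelForCausalLM.from_pretrained(
        MODEL_ID,
        torch_dtype="auto",   # picks float16/bfloat16 where supported
        device_map="auto",    # places layers on available devices
    )
    generator = pipeline("text-generation", model=model, tokenizer=tokenizer)

    # Chat-format input; the pipeline applies the model's chat template.
    messages = [
        {"role": "user", "content": "Explain in one sentence why the sky is blue."}
    ]
    output = generator(messages, max_new_tokens=128, do_sample=False)
    # generated_text holds the full chat; the last turn is the model's reply.
    print(output[0]["generated_text"][-1]["content"])


if __name__ == "__main__":
    main()
```

Greedy decoding (`do_sample=False`) is used here for reproducibility; for open-ended generation you would typically enable sampling and set a temperature.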
Good For
- General-purpose AI systems and applications that require strong reasoning in resource-limited environments.
- Accelerating research in language models.
- Use cases where latency is a critical factor.