microsoft/Phi-3-medium-4k-instruct

  • Task: Text generation
  • Concurrency cost: 1
  • Model size: 14.7B
  • Quant: FP8
  • Context length: 32K
  • Published: May 7, 2024
  • License: MIT
  • Architecture: Transformer
  • Open weights

Phi-3-Medium-4K-Instruct is a 14.7 billion parameter, lightweight, instruction-tuned causal language model developed by Microsoft. Trained on a high-quality, reasoning-dense dataset, it features a 4K token context length and excels at common-sense reasoning, language understanding, math, code, and logical reasoning tasks. The model is optimized for strong reasoning in memory- and compute-constrained, latency-bound environments.


Overview

Microsoft's Phi-3-Medium-4K-Instruct is a 14.7 billion parameter, instruction-tuned model from the Phi-3 family, designed for robust performance in resource-constrained settings. It was trained on a unique dataset combining synthetic data and filtered public web content, emphasizing quality and reasoning density. The model underwent supervised fine-tuning and direct preference optimization for instruction following and safety.

Key Capabilities

  • Strong Reasoning: Achieves state-of-the-art performance among same-sized and next-size-up models across benchmarks for common sense, language understanding, math, code, and logical reasoning.
  • Optimized for Efficiency: Designed for memory/compute-constrained environments and latency-bound scenarios.
  • Instruction Following: Fine-tuned with SFT and DPO for effective instruction adherence and safety.
  • 4K Context Length: Supports a 4,096 token context window.
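To illustrate how instruction-tuned Phi-3 models are prompted, the sketch below builds a chat prompt in the `<|user|>` / `<|assistant|>` turn format that Phi-3's chat template uses, where each turn ends with `<|end|>`. This is a minimal hand-rolled illustration, not the official implementation; in practice the Hugging Face tokenizer's `apply_chat_template` method should be used so the template always matches the model.

```python
def format_phi3_prompt(messages):
    """Render a list of {"role": ..., "content": ...} dicts into the
    Phi-3 chat format: each turn is <|role|>\\n<content><|end|>\\n,
    followed by an open <|assistant|> turn for the model to complete."""
    parts = []
    for message in messages:
        parts.append(f"<|{message['role']}|>\n{message['content']}<|end|>\n")
    parts.append("<|assistant|>\n")  # cue the model to respond
    return "".join(parts)

prompt = format_phi3_prompt([
    {"role": "user", "content": "Solve 2x + 3 = 7."},
])
print(prompt)
```

Keeping prompts like this well under the 4,096-token window (including room for the generated reply) matters, since the 4K variant truncates or rejects inputs beyond that length.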

Good For

  • General-purpose AI systems and applications requiring strong reasoning in limited resource environments.
  • Accelerating research in language models.
  • Use cases where latency is a critical factor.