Lexius/Phi-3.5-mini-instruct

TEXT GENERATION · Concurrency Cost: 1 · Model Size: 4B · Quant: BF16 · Ctx Length: 4k · Published: Jun 2, 2025 · License: MIT · Architecture: Transformer · Open Weights

Lexius/Phi-3.5-mini-instruct is a 3.8 billion parameter instruction-tuned decoder-only Transformer developed by Microsoft as part of the Phi-3 family. It is optimized for strong reasoning, particularly in code, math, and logic, and supports a 128K token context length. The model is well suited to memory/compute-constrained environments and latency-bound scenarios, and offers improved multilingual and multi-turn conversation quality over its predecessor. It is intended as a building block for general-purpose AI systems that need efficient performance and robust reasoning across languages.


Overview

Lexius/Phi-3.5-mini-instruct is a 3.8 billion parameter instruction-tuned model from Microsoft's Phi-3 family, updated in August 2024. It builds on the Phi-3 architecture and was trained on high-quality synthetic and filtered public datasets with a strong focus on reasoning-dense data. Post-training combined supervised fine-tuning, proximal policy optimization (PPO), and direct preference optimization (DPO) to improve instruction adherence and safety. The model supports a 128K token context length, making it suitable for long-document summarization, QA, and information-retrieval tasks.
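As an instruction-tuned model, it expects chat-formatted prompts. In practice the tokenizer's `apply_chat_template` handles this; the sketch below shows the idea, assuming the `<|system|>`/`<|user|>`/`<|assistant|>`/`<|end|>` special tokens used by the Phi-3 family:

```python
def build_phi3_prompt(messages):
    """Render a list of {"role", "content"} dicts into the Phi-3 chat
    format: each turn is wrapped as <|role|>\n...<|end|>\n, and the
    prompt ends with <|assistant|> so the model continues that turn."""
    parts = []
    for m in messages:
        parts.append(f"<|{m['role']}|>\n{m['content']}<|end|>\n")
    parts.append("<|assistant|>\n")
    return "".join(parts)

prompt = build_phi3_prompt([
    {"role": "system", "content": "You are a helpful assistant."},
    {"role": "user", "content": "Solve 2x + 3 = 7."},
])
```

When serving through `transformers` or an inference server, prefer the tokenizer's built-in chat template over hand-rolled formatting, since it stays in sync with the model's special tokens.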

Key Capabilities

  • Strong Reasoning: Excels in code, math, and logic tasks, demonstrating competitive performance against larger models.
  • Multilingual Proficiency: Shows substantial gains in multilingual MMLU, MEGA, and multi-turn conversation quality across various languages, including Arabic, Chinese, French, German, and Spanish.
  • Long Context Handling: Capable of processing and understanding long contexts up to 128K tokens, outperforming some models with larger parameter counts in long-context benchmarks like GovReport and Qasper.
  • Efficient Performance: Designed for memory/compute-constrained environments and latency-bound scenarios, making it a lightweight yet powerful option.
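The "lightweight" claim above is easy to sanity-check with back-of-the-envelope arithmetic: at BF16 (2 bytes per parameter), the 3.8B weights alone occupy roughly 7.6 GB, before KV cache, activations, and runtime overhead. A rough sketch (the helper and figures are illustrative, not measured):

```python
def weight_memory_gb(n_params: float, bytes_per_param: float) -> float:
    """Approximate memory for model weights alone, in GB.
    Excludes KV cache, activations, and framework overhead."""
    return n_params * bytes_per_param / 1e9

bf16_gb = weight_memory_gb(3.8e9, 2.0)   # BF16: 2 bytes per parameter
int4_gb = weight_memory_gb(3.8e9, 0.5)   # 4-bit quantized: 0.5 bytes per parameter
```

This is why a 3.8B model fits comfortably on a single consumer GPU, while a quantized build can run on far smaller devices.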

Good For

  • General-purpose AI systems: Ideal for applications requiring robust language understanding and generation.
  • Research and Development: Serves as a strong building block for generative AI features and accelerating research in language models.
  • Resource-constrained deployments: Suitable for environments where computational resources or latency are critical factors.
  • Multilingual applications: Offers improved performance in non-English languages, making it versatile for global use cases.