unsloth/Phi-3.5-mini-instruct

Text Generation · Concurrency Cost: 1 · Model Size: 4B · Quant: BF16 · Ctx Length: 4k · Published: Aug 20, 2024 · License: MIT · Architecture: Transformer · Open Weights

unsloth/Phi-3.5-mini-instruct is a 3.8 billion parameter instruction-tuned decoder-only Transformer model developed by Microsoft AI and the Phi team. It supports a 128K token context length and is optimized for strong reasoning, particularly in code, math, and logic. The model is well suited to memory- and compute-constrained environments and latency-bound scenarios, while offering competitive multilingual performance.


unsloth/Phi-3.5-mini-instruct: Optimized for Reasoning and Multilingual Tasks

This model is a 3.8 billion parameter instruction-tuned variant of the Phi-3.5 family, developed by Microsoft AI. It leverages synthetic data and filtered public datasets, with a strong focus on high-quality, reasoning-dense information. The model supports an extensive 128K token context length, making it suitable for long document summarization and complex QA tasks.
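As an instruction-tuned model, it expects prompts in the Phi-3 family's chat format. The sketch below builds such a prompt by hand to make the structure visible; the `<|system|>`, `<|user|>`, `<|assistant|>`, and `<|end|>` markers are assumed from the Phi-3 model card, and in practice you should prefer the tokenizer's `apply_chat_template` method rather than hand-rolling strings.

```python
# Illustrative sketch of the Phi-3-style chat format; marker tokens are an
# assumption from the Phi-3 model card, not verified against this checkpoint.
def build_phi_prompt(messages):
    """Render a list of {"role", "content"} dicts into a Phi-3-style prompt."""
    parts = []
    for msg in messages:
        parts.append(f"<|{msg['role']}|>\n{msg['content']}<|end|>\n")
    parts.append("<|assistant|>\n")  # generation prompt: the model continues here
    return "".join(parts)

prompt = build_phi_prompt([
    {"role": "system", "content": "You are a helpful assistant."},
    {"role": "user", "content": "Solve 12 * 7 step by step."},
])
```

With Hugging Face `transformers`, the equivalent is `tokenizer.apply_chat_template(messages, add_generation_prompt=True)`, which applies the template stored with the checkpoint.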

Key Capabilities

  • Strong Reasoning: Excels in code, math, and logic, achieving high scores on benchmarks like GSM8K and MATH.
  • Multilingual Performance: Demonstrates competitive performance across various languages on benchmarks such as Multilingual MMLU and MGSM, despite its compact size.
  • Long Context Understanding: Capable of handling 128K token contexts, outperforming some larger models in tasks like GovReport and QMSum.
  • Efficient Fine-tuning: Unsloth provides tools for 2x faster fine-tuning with 50% less memory usage.
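The fine-tuning point above can be sketched with Unsloth's `FastLanguageModel` loader. This is a minimal outline, assuming the standard Unsloth API (`from_pretrained`, `get_peft_model`); it requires a CUDA GPU and `pip install unsloth`, and the LoRA hyperparameters shown are illustrative defaults, not tuned values.

```python
# Hedged sketch of an Unsloth LoRA fine-tuning setup. Hyperparameters live in
# a plain dict so they can be inspected without a GPU; values are illustrative.
LORA_CONFIG = {
    "r": 16,                # LoRA rank (illustrative choice)
    "lora_alpha": 16,
    "target_modules": ["q_proj", "k_proj", "v_proj", "o_proj",
                       "gate_proj", "up_proj", "down_proj"],
}

def load_for_finetuning(max_seq_length: int = 4096):
    """Load Phi-3.5-mini via Unsloth's patched loader and attach LoRA adapters.

    Requires a CUDA GPU; the import is deferred so the config above can be
    inspected on any machine.
    """
    from unsloth import FastLanguageModel
    model, tokenizer = FastLanguageModel.from_pretrained(
        model_name="unsloth/Phi-3.5-mini-instruct",
        max_seq_length=max_seq_length,
        load_in_4bit=True,  # QLoRA-style 4-bit base weights to cut memory
    )
    model = FastLanguageModel.get_peft_model(model, **LORA_CONFIG)
    return model, tokenizer
```

The returned model and tokenizer can then be passed to a standard TRL `SFTTrainer` loop; Unsloth's speed and memory savings come from its patched kernels, not from any change to the training recipe.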

Good For

  • Applications requiring strong reasoning in resource-constrained environments.
  • Use cases demanding long context processing, such as document analysis and summarization.
  • Multilingual applications where a compact yet capable model is needed.
  • A building block for generative AI features, especially when augmented with RAG for factual knowledge.
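For the long-context and RAG use cases above, retrieved or chunked text must be packed into the model's context window. The toy helper below greedily fills a token budget; the 4-characters-per-token heuristic and the function name are assumptions for illustration (a real pipeline would count tokens with the model's tokenizer).

```python
# Toy sketch of packing retrieved chunks into a context budget. The
# chars-per-token heuristic is a rough assumption, not a tokenizer count.
def fit_context(chunks, budget_tokens=128_000, chars_per_token=4):
    """Greedily pack chunks (assumed pre-sorted by relevance) into the budget."""
    budget_chars = budget_tokens * chars_per_token
    packed, used = [], 0
    for chunk in chunks:
        if used + len(chunk) > budget_chars:
            break  # stop at the first chunk that would overflow the window
        packed.append(chunk)
        used += len(chunk)
    return "\n\n".join(packed)
```

The packed string would then be prepended to the user question in the prompt; with a 128K window, entire reports often fit without retrieval at all, which is what makes this model competitive on GovReport- and QMSum-style tasks.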