IntelLabs/sqft-phi-3-mini-4k-50-base
Hosted on Hugging Face

Text Generation · Model size: 4B · Quant: BF16 · Context length: 4k · Published: Apr 26, 2024 · License: apache-2.0 · Architecture: Transformer · Open weights

The IntelLabs/sqft-phi-3-mini-4k-50-base is a 4 billion parameter language model derived from Microsoft's Phi-3-mini-4k-instruct, with 50% sparsity applied using the Wanda pruning method. Developed by IntelLabs, it serves as a base model for research into low-cost adaptation of low-precision sparse foundation models (the SQFT line of work). It retains the 4096-token context length of its parent and targets research in hardware-aware automated machine learning.


Model Overview

The IntelLabs/sqft-phi-3-mini-4k-50-base is a 4 billion parameter language model developed by IntelLabs. It is based on the microsoft/Phi-3-mini-4k-instruct model, with a key modification: 50% sparsity applied using the Wanda pruning method. This model is part of ongoing research into low-cost model adaptation within low-precision sparse foundation models, as detailed in the associated research papers.
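For intuition, Wanda scores each weight by the product of its magnitude and the norm of the corresponding input activation, then zeroes the lowest-scoring weights. The sketch below is a minimal illustration of that idea, not code from the SQFT release; the function name, shapes, and 50% target are illustrative assumptions.

```python
import numpy as np

def wanda_prune(W, X, sparsity=0.5):
    """Illustrative Wanda-style pruning sketch (not the SQFT implementation).

    W: weight matrix, shape (out_dim, in_dim).
    X: calibration activations, shape (n_samples, in_dim).
    Each weight W[i, j] is scored as |W[i, j]| * ||X[:, j]||_2, and the
    lowest-scoring fraction `sparsity` of weights in each row is zeroed.
    """
    act_norm = np.linalg.norm(X, axis=0)        # per-input-feature norm, (in_dim,)
    score = np.abs(W) * act_norm                # broadcasts over output rows
    k = int(W.shape[1] * sparsity)              # weights to drop per row
    drop_idx = np.argsort(score, axis=1)[:, :k] # k lowest scores per row
    W_pruned = W.copy()
    np.put_along_axis(W_pruned, drop_idx, 0.0, axis=1)
    return W_pruned

rng = np.random.default_rng(0)
W = rng.normal(size=(8, 16))
X = rng.normal(size=(32, 16))
Wp = wanda_prune(W, X, sparsity=0.5)
print((Wp == 0).mean())  # half the weights in each row are now zero
```

Unlike magnitude pruning, the activation-norm factor lets Wanda keep small weights that feed high-energy input features, which is why it prunes well without retraining.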

Key Characteristics

  • Base Model: Derived from microsoft/Phi-3-mini-4k-instruct.
  • Sparsity: 50% of weights are zeroed using the Wanda pruning method, reducing the effective compute and memory footprint at inference.
  • Quantization: This base model is not quantized; weights are stored in BF16.
  • Context Length: Supports a 4096-token context window.
  • Research Focus: Primarily developed for research in hardware-aware automated machine learning and efficient model deployment.
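The 50% sparsity figure above can be sanity-checked after loading the checkpoint by counting exact zeros per tensor. The helper below is a hypothetical sketch (`layer_sparsity` is not part of any release); in practice the input would come from something like `model.state_dict()` in transformers, with tensors converted to numpy.

```python
import numpy as np

def layer_sparsity(state_dict):
    """Return (per-tensor zero fraction, overall zero fraction).

    state_dict: mapping of tensor name -> array-like of weights.
    A weight counts as pruned only if it is exactly zero.
    """
    report, zeros, total = {}, 0, 0
    for name, tensor in state_dict.items():
        t = np.asarray(tensor)
        z = int((t == 0).sum())
        report[name] = z / t.size
        zeros += z
        total += t.size
    return report, zeros / total

# Toy example with two small tensors standing in for model layers.
weights = {
    "layer0.weight": np.array([0.0, 1.0, 0.0, 2.0]),
    "layer1.weight": np.array([1.0, 0.0]),
}
per_layer, overall = layer_sparsity(weights)
print(per_layer, overall)  # both toy tensors are 50% zero
```

For a Wanda-pruned checkpoint like this one, the per-layer fractions of the pruned linear layers should cluster around 0.5; embedding and norm parameters are typically left dense.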

Good for

  • Research in Model Compression: Ideal for researchers exploring sparse model architectures and their adaptation.
  • Efficient Inference Studies: Suitable for investigating the performance and efficiency benefits of highly sparse models.
  • Hardware-Aware ML Development: A foundational model for projects focused on optimizing LLMs for specific hardware constraints.