IntelLabs/sqft-phi-3-mini-4k-50-base

License: apache-2.0

Model Overview

IntelLabs/sqft-phi-3-mini-4k-50-base is a 4-billion-parameter language model developed by IntelLabs. It is derived from microsoft/Phi-3-mini-4k-instruct with one key modification: 50% sparsity applied using the Wanda pruning method. The model is part of ongoing research into low-cost model adaptation in low-precision sparse foundation models (SQFT), as detailed in the associated research papers.

Key Characteristics

  • Base Model: Derived from microsoft/Phi-3-mini-4k-instruct.
  • Sparsity: 50% of weights are pruned with the Wanda method to reduce compute and memory cost.
  • Precision: Weights are stored in BF16; this base model is not quantized.
  • Context Length: Supports a 4096-token context window.
  • Research Focus: Primarily developed for research in hardware-aware automated machine learning and efficient model deployment.
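The Wanda method mentioned above scores each weight by the product of its magnitude and the norm of its input activation, then removes the lowest-scoring weights. A minimal, self-contained sketch of that scoring rule (synthetic weights and calibration activations; not the IntelLabs pipeline itself):

```python
import numpy as np

def wanda_prune(W, X, sparsity=0.5):
    """Wanda-style pruning sketch: score each weight by
    |weight| * L2-norm of its input activation column, then zero the
    lowest-scoring fraction of weights within each output row."""
    act_norm = np.linalg.norm(X, axis=0)       # (in_features,)
    scores = np.abs(W) * act_norm              # (out_features, in_features)
    k = int(W.shape[1] * sparsity)             # weights to drop per row
    drop = np.argsort(scores, axis=1)[:, :k]   # k lowest-scoring per row
    mask = np.ones_like(W, dtype=bool)
    np.put_along_axis(mask, drop, False, axis=1)
    return W * mask

rng = np.random.default_rng(0)
W = rng.normal(size=(8, 16))                   # toy linear-layer weights
X = rng.normal(size=(32, 16))                  # toy calibration activations
W_sparse = wanda_prune(W, X)
print((W_sparse == 0).mean())                  # → 0.5
```

The per-row (per-output) comparison group follows the original Wanda formulation; the real method collects activation norms from calibration data run through the full model.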

Good for

  • Research in Model Compression: Ideal for researchers exploring sparse model architectures and their adaptation.
  • Efficient Inference Studies: Suitable for investigating the performance and efficiency benefits of highly sparse models.
  • Hardware-Aware ML Development: A foundational model for projects focused on optimizing LLMs for specific hardware constraints.
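For the studies above, a quick sanity check on any pruned checkpoint is to count exactly-zero entries per weight matrix. A minimal sketch on a synthetic state dict (real usage would load the IntelLabs checkpoint, e.g. via the Hugging Face transformers library, which is not done here):

```python
import numpy as np

def layer_sparsity(state_dict):
    """Return the fraction of exactly-zero entries in each 2-D weight
    matrix, as a quick check of a pruned checkpoint's sparsity level."""
    report = {}
    for name, w in state_dict.items():
        w = np.asarray(w)
        if w.ndim == 2:  # look at linear-layer weight matrices only
            report[name] = float((w == 0).mean())
    return report

# Synthetic stand-in for a checkpoint: one dense layer and one layer
# with half of its weights zeroed out.
rng = np.random.default_rng(1)
dense = rng.normal(size=(4, 8))
pruned = dense.copy()
pruned[:, ::2] = 0.0
print(layer_sparsity({"dense.weight": dense, "pruned.weight": pruned}))
# → {'dense.weight': 0.0, 'pruned.weight': 0.5}
```

On this model one would expect roughly 0.5 for the pruned projection matrices, matching the 50% sparsity stated in the card.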