IntelLabs/sqft-phi-3-mini-4k-40-base

4B parameters · BF16 · 4096-token context · License: apache-2.0
Overview

IntelLabs/sqft-phi-3-mini-4k-40-base is a 4-billion-parameter language model developed by IntelLabs. It is derived from microsoft/Phi-3-mini-4k-instruct with one significant modification: 40% sparsity applied using the Wanda pruning method. The model is designed as an efficient base for low-cost deployment and adaptation, as detailed in the research paper "SQFT: Low-cost Model Adaptation in Low-precision Sparse Foundation Models" (available on arXiv).
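
For intuition, Wanda (Pruning by Weights and Activations) scores each weight by the product of its magnitude and the norm of the corresponding input activation over a calibration set, then zeroes the lowest-scoring weights within each output row. The sketch below is an illustrative re-implementation of that scoring rule on a toy layer, not the actual code used to produce this checkpoint; the function name and shapes are assumptions.

```python
import torch

def wanda_style_mask(weight: torch.Tensor, act_norm: torch.Tensor, sparsity: float) -> torch.Tensor:
    """Illustrative Wanda-style pruning mask: score = |W_ij| * ||X_j||, pruned per output row.

    weight:   (out_features, in_features) linear-layer weight
    act_norm: (in_features,) L2 norm of each input feature over a calibration set
    sparsity: fraction of weights to zero out per row (e.g. 0.4 for ~40%)
    """
    scores = weight.abs() * act_norm          # broadcasts act_norm across rows
    k = int(weight.shape[1] * sparsity)       # number of weights to prune per row
    # Indices of the k lowest-scoring weights in each row
    prune_idx = torch.topk(scores, k, dim=1, largest=False).indices
    mask = torch.ones_like(weight)
    mask.scatter_(1, prune_idx, 0.0)          # zero out the pruned positions
    return mask

# Toy usage: a 4x8 layer pruned to ~40% sparsity
w = torch.randn(4, 8)
norms = torch.rand(8)
mask = wanda_style_mask(w, norms, sparsity=0.4)
pruned_w = w * mask
print(f"achieved sparsity: {(mask == 0).float().mean():.2f}")
```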

Key Characteristics

  • Source Model: Built upon Microsoft's Phi-3-mini-4k-instruct.
  • Sparsity: Achieves 40% sparsity through the Wanda pruning technique, enhancing efficiency.
  • Parameter Count: 4 billion parameters.
  • Context Length: Supports a 4096-token context window.
  • Quantization: None applied beyond the sparse structure; the checkpoint is stored in BF16 precision (see the loading sketch below).
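
The checkpoint can be loaded like any other Phi-3 model through the Hugging Face transformers library. A minimal sketch, assuming transformers and torch are installed (and accelerate if you use device_map); the prompt and generation settings are illustrative only:

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "IntelLabs/sqft-phi-3-mini-4k-40-base"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,  # the checkpoint is stored in BF16
    device_map="auto",           # requires `accelerate`; drop to load on CPU
)
# Note: older transformers releases may need trust_remote_code=True for Phi-3.

prompt = "Explain what model sparsity means in one sentence."
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=64)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```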

Intended Use Cases

This model is particularly well-suited for scenarios requiring:

  • Resource-constrained environments: Its sparse nature allows for more efficient inference and deployment.
  • Low-cost model adaptation: Designed for settings where foundation models must be fine-tuned with limited computational resources, e.g. via parameter-efficient adapters (a sketch follows this list).
  • Research into sparse and efficient LLMs: Serves as a practical example for exploring the benefits of sparsity in language models.
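
As a rough illustration of the adaptation workflow, a LoRA adapter can be attached with the peft library. This is a sketch under assumed hyperparameters and target-module names, not the SQFT fine-tuning recipe from the paper:

```python
import torch
from peft import LoraConfig, get_peft_model
from transformers import AutoModelForCausalLM

model = AutoModelForCausalLM.from_pretrained(
    "IntelLabs/sqft-phi-3-mini-4k-40-base",
    torch_dtype=torch.bfloat16,
)
lora_config = LoraConfig(
    r=16,                                    # illustrative rank
    lora_alpha=32,
    target_modules=["qkv_proj", "o_proj"],   # Phi-3 attention projections (assumed targets)
    lora_dropout=0.05,
    task_type="CAUSAL_LM",
)
model = get_peft_model(model, lora_config)
model.print_trainable_parameters()  # only the adapter weights are trainable
```

Because the sparse base weights stay frozen during adapter training, the 40% sparsity of the backbone is preserved through fine-tuning, which is the efficiency property SQFT targets.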