Overview
IntelLabs/sqft-phi-3-mini-4k-40-base is a sparse language model developed by IntelLabs. It is derived from microsoft/Phi-3-mini-4k-instruct (a 3.8 billion parameter model) with one significant modification: 40% sparsity applied using the Wanda pruning method. The model is designed for efficient deployment and low-cost adaptation, as detailed in the research paper "SQFT: Low-cost Model Adaptation in Low-precision Sparse Foundation Models" (available on arXiv).
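Since the model is hosted as a standard Transformers checkpoint, it can presumably be loaded with the usual Hugging Face API. The sketch below uses only the repo id from this card; dtype, device placement, and the `trust_remote_code` flag are assumptions that may need adjusting for your setup.

```python
# Minimal sketch: loading and prompting the model with Hugging Face Transformers.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "IntelLabs/sqft-phi-3-mini-4k-40-base"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.float16,  # assumption: half precision fits your hardware
    device_map="auto",
    # trust_remote_code=True,   # may be required by older Transformers versions for Phi-3
)

prompt = "Explain sparsity in neural networks in one sentence."
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=64)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```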
Key Characteristics
- Source Model: Built upon Microsoft's Phi-3-mini-4k-instruct.
- Sparsity: 40% of weights pruned with the Wanda technique, improving inference efficiency (a sparsity check is sketched after this list).
- Parameter Count: 3.8 billion parameters, inherited from the Phi-3-mini base.
- Context Length: Supports a 4096-token context window.
- Quantization: None applied; this variant introduces sparsity only, and weights retain the base model's precision.
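The sparsity claim can be checked empirically by counting zero-valued weights. The sketch below assumes `model` was loaded as in the snippet above; the layer selection is a rough illustration (it counts all 2-D weight matrices, including embeddings, so the measured figure may deviate slightly from the nominal 40%).

```python
# Minimal sketch: estimating overall weight sparsity by counting zeros.
def weight_sparsity(model) -> float:
    zeros, total = 0, 0
    for name, param in model.named_parameters():
        if param.dim() == 2 and "weight" in name:  # 2-D weight matrices only
            zeros += (param == 0).sum().item()
            total += param.numel()
    return zeros / total

print(f"Overall weight sparsity: {weight_sparsity(model):.1%}")  # expect roughly 40%
```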
Intended Use Cases
This model is particularly well-suited for scenarios requiring:
- Resource-constrained environments: Its sparse nature allows for more efficient inference and deployment.
- Low-cost model adaptation: Fine-tuning the foundation model when computational resources are limited (see the adaptation sketch after this list).
- Research into sparse and efficient LLMs: Serves as a practical example for exploring the benefits of sparsity in language models.
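As a rough illustration of low-cost adaptation, the sketch below attaches LoRA adapters with the Hugging Face PEFT library. Note this is generic parameter-efficient fine-tuning, not the sparsity-aware adaptation pipeline described in the SQFT paper; the rank, alpha, and target module names (based on Phi-3's fused attention projections) are all assumptions.

```python
# Minimal sketch: generic LoRA adaptation of the sparse base model via PEFT.
from peft import LoraConfig, get_peft_model
from transformers import AutoModelForCausalLM

model = AutoModelForCausalLM.from_pretrained("IntelLabs/sqft-phi-3-mini-4k-40-base")

lora_config = LoraConfig(
    r=16,                                   # adapter rank (assumption)
    lora_alpha=32,
    target_modules=["qkv_proj", "o_proj"],  # assumption: Phi-3 attention projections
    lora_dropout=0.05,
    task_type="CAUSAL_LM",
)

model = get_peft_model(model, lora_config)
model.print_trainable_parameters()  # only the small adapter matrices are trainable
```

Because the base weights stay frozen, only the adapter parameters (a small fraction of the total) are updated during training, which is what makes this style of adaptation inexpensive.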