Model Overview
IntelLabs/sqft-phi-3-mini-4k-50-base is a 3.8 billion parameter language model developed by IntelLabs. It is derived from microsoft/Phi-3-mini-4k-instruct with one key modification: 50% unstructured sparsity applied using the Wanda pruning method. The model is part of the SQFT line of research into low-cost model adaptation in low-precision sparse foundation models, as detailed in the associated research papers.
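The core idea behind Wanda is to score each weight by its magnitude times the norm of the input activations it multiplies, then zero the lowest-scoring weights per output row. The sketch below is an illustrative NumPy reimplementation under that description, not IntelLabs' actual pruning code; the function name, shapes, and calibration data are all made up for the example.

```python
import numpy as np

def wanda_prune(W, X, sparsity=0.5):
    """Prune W (out_features x in_features) to the given sparsity.

    Wanda score per weight: |W_ij| * ||X_:,j||_2, where X holds
    calibration activations (n_samples x in_features).
    """
    # Broadcast the per-input-feature activation norm across output rows
    metric = np.abs(W) * np.linalg.norm(X, axis=0)
    # Per output row, zero the k weights with the lowest scores
    k = int(W.shape[1] * sparsity)
    mask = np.ones_like(W, dtype=bool)
    lowest = np.argsort(metric, axis=1)[:, :k]
    np.put_along_axis(mask, lowest, False, axis=1)
    return W * mask

rng = np.random.default_rng(0)
W = rng.standard_normal((8, 16))   # toy weight matrix
X = rng.standard_normal((32, 16))  # toy calibration activations
Wp = wanda_prune(W, X, sparsity=0.5)
print((Wp == 0).mean())  # 0.5
```

Because the score uses activation norms rather than weight magnitude alone, Wanda needs a small calibration set but no retraining, which is what makes it cheap to apply to an existing checkpoint.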
Key Characteristics
- Base Model: Derived from microsoft/Phi-3-mini-4k-instruct.
- Sparsity: 50% sparsity applied with the Wanda method, aiming for efficiency.
- Quantization: This base model is not quantized; quantization is addressed in related SQFT variants.
- Context Length: Supports a 4096-token context window.
- Research Focus: Primarily developed for research in hardware-aware automated machine learning and efficient model deployment.
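The 50% sparsity figure can be checked directly by counting exactly-zero entries in each weight matrix. The helper below runs on any mapping of names to arrays; here it is demonstrated on a synthetic state dict with hypothetical layer names and shapes, since downloading the real checkpoint (e.g. via `transformers`) works the same way but is too heavy for a quick sketch.

```python
import numpy as np

def layer_sparsity(state_dict):
    """Return the fraction of exactly-zero entries per weight tensor."""
    return {name: float((w == 0).mean()) for name, w in state_dict.items()}

# Synthetic stand-in for the checkpoint: zero out ~50% of each matrix
# (layer names and shapes are illustrative, not the model's real ones)
rng = np.random.default_rng(1)
fake_state = {}
for name in ["layers.0.self_attn.qkv_proj.weight", "layers.0.mlp.gate_up_proj.weight"]:
    w = rng.standard_normal((64, 64))
    w[np.abs(w) < np.median(np.abs(w))] = 0.0
    fake_state[name] = w

report = layer_sparsity(fake_state)
for name, s in report.items():
    print(f"{name}: {s:.2f}")
```

On the actual checkpoint, each prunable projection matrix should report close to 0.50 if the Wanda mask was applied uniformly.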
Good for
- Research in Model Compression: Ideal for researchers exploring sparse model architectures and their adaptation.
- Efficient Inference Studies: Suitable for investigating the performance and efficiency benefits of highly sparse models.
- Hardware-Aware ML Development: A foundational model for projects focused on optimizing LLMs for specific hardware constraints.