IntelLabs/sqft-phi-3-mini-4k-40-base

4B parameters · BF16 · 4096-token context · License: apache-2.0
Overview

IntelLabs/sqft-phi-3-mini-4k-40-base is a 4-billion-parameter language model developed by IntelLabs. It is derived from microsoft/Phi-3-mini-4k-instruct with one significant modification: 40% sparsity applied using the Wanda pruning method. The model is designed as an efficient base for low-cost deployment and adaptation, as detailed in the research paper "SQFT: Low-cost Model Adaptation in Low-precision Sparse Foundation Models" (available on arXiv).
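
For intuition, Wanda (Pruning by Weights and Activations) scores each weight by the product of its magnitude and the norm of the corresponding input activation over a calibration set, then zeroes the lowest-scoring weights within each output row. The sketch below is an illustrative re-implementation of that scoring rule on a toy layer, not the actual code used to produce this checkpoint; the function name and shapes are assumptions.

```python
import torch

def wanda_style_mask(weight: torch.Tensor, act_norm: torch.Tensor, sparsity: float) -> torch.Tensor:
    """Illustrative Wanda-style pruning mask: score = |W_ij| * ||X_j||, pruned per output row.

    weight:   (out_features, in_features) linear-layer weight
    act_norm: (in_features,) L2 norm of each input feature over a calibration set
    sparsity: fraction of weights to zero out per row (e.g. 0.4 for ~40%)
    """
    scores = weight.abs() * act_norm          # broadcasts act_norm across rows
    k = int(weight.shape[1] * sparsity)       # number of weights to prune per row
    # Indices of the k lowest-scoring weights in each row
    prune_idx = torch.topk(scores, k, dim=1, largest=False).indices
    mask = torch.ones_like(weight)
    mask.scatter_(1, prune_idx, 0.0)          # zero out the pruned positions
    return mask

# Toy usage: a 4x8 layer pruned to ~40% sparsity
w = torch.randn(4, 8)
norms = torch.rand(8)
mask = wanda_style_mask(w, norms, sparsity=0.4)
pruned_w = w * mask
print(f"achieved sparsity: {(mask == 0).float().mean():.2f}")
```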

Key Characteristics

  • Source Model: Built upon Microsoft's Phi-3-mini-4k-instruct.
  • Sparsity: Achieves 40% sparsity through the Wanda pruning technique, enhancing efficiency.
  • Parameter Count: 4 billion parameters.
  • Context Length: Supports a 4096-token context window.
  • Quantization: None applied beyond the sparse structure; the checkpoint is stored in BF16 precision (see the loading sketch below).
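
The checkpoint can be loaded like any other Phi-3 model through the Hugging Face transformers library. A minimal sketch, assuming transformers and torch are installed (and accelerate if you use device_map); the prompt and generation settings are illustrative only:

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "IntelLabs/sqft-phi-3-mini-4k-40-base"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,  # the checkpoint is stored in BF16
    device_map="auto",           # requires `accelerate`; drop to load on CPU
)
# Note: older transformers releases may need trust_remote_code=True for Phi-3.

prompt = "Explain what model sparsity means in one sentence."
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=64)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```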

Intended Use Cases

This model is particularly well-suited for scenarios requiring:

  • Resource-constrained environments: Its sparse nature allows for more efficient inference and deployment.
  • Low-cost model adaptation: Designed for settings where foundation models must be fine-tuned with limited computational resources, e.g. via parameter-efficient adapters (a sketch follows this list).
  • Research into sparse and efficient LLMs: Serves as a practical example for exploring the benefits of sparsity in language models.
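
As a rough illustration of the adaptation workflow, a LoRA adapter can be attached with the peft library. This is a sketch under assumed hyperparameters and target-module names, not the SQFT fine-tuning recipe from the paper:

```python
import torch
from peft import LoraConfig, get_peft_model
from transformers import AutoModelForCausalLM

model = AutoModelForCausalLM.from_pretrained(
    "IntelLabs/sqft-phi-3-mini-4k-40-base",
    torch_dtype=torch.bfloat16,
)
lora_config = LoraConfig(
    r=16,                                    # illustrative rank
    lora_alpha=32,
    target_modules=["qkv_proj", "o_proj"],   # Phi-3 attention projections (assumed targets)
    lora_dropout=0.05,
    task_type="CAUSAL_LM",
)
model = get_peft_model(model, lora_config)
model.print_trainable_parameters()  # only the adapter weights are trainable
```

Because the sparse base weights stay frozen during adapter training, the 40% sparsity of the backbone is preserved through fine-tuning, which is the efficiency property SQFT targets.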