wang7776/Mistral-7B-Instruct-v0.2-sparsity-10
wang7776/Mistral-7B-Instruct-v0.2-sparsity-10 is a 7-billion-parameter instruction-tuned causal language model based on Mistral AI's Mistral-7B-Instruct-v0.2. It has been pruned to 10% sparsity with the Wanda method, a one-shot pruning approach that aims to maintain competitive performance without retraining. It retains the base model's Grouped-Query Attention, Sliding-Window Attention, and byte-fallback BPE tokenizer, making it suitable for efficient instruction-following tasks.
Overview
This model, wang7776/Mistral-7B-Instruct-v0.2-sparsity-10, is a 7-billion-parameter instruction-tuned language model derived from Mistral AI's Mistral-7B-Instruct-v0.2. Its key differentiator is the application of the Wanda pruning method, which zeroes out 10% of the model's weights (10% sparsity) without any additional retraining or weight updates, while still aiming for competitive performance.
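To make the pruning criterion concrete, here is a minimal NumPy sketch of the Wanda scoring idea: each weight is scored by its magnitude times the L2 norm of the corresponding input activation (collected from calibration data), and the lowest-scoring fraction of weights in each output row is zeroed. This is an illustrative simplification, not the repository's actual pruning code; the function name and shapes are assumptions for the example.

```python
import numpy as np

def wanda_prune(W, X, sparsity=0.10):
    """Illustrative Wanda-style pruning of one linear layer.

    W: (out_features, in_features) weight matrix.
    X: (num_samples, in_features) calibration activations.
    Scores each weight as |W_ij| * ||X_j||_2 and zeroes the
    lowest-scoring `sparsity` fraction of weights per output row.
    """
    act_norm = np.linalg.norm(X, axis=0)       # (in_features,) per-channel norms
    scores = np.abs(W) * act_norm              # broadcast over output rows
    k = int(W.shape[1] * sparsity)             # weights to drop in each row
    W_pruned = W.copy()
    if k > 0:
        # indices of the k smallest scores in each row
        idx = np.argpartition(scores, k - 1, axis=1)[:, :k]
        np.put_along_axis(W_pruned, idx, 0.0, axis=1)
    return W_pruned
```

Because the comparison is per row against an activation-weighted score, weights that are small but feed highly active inputs can survive, which is the intuition behind Wanda's retraining-free performance.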
Key Capabilities
- Efficient Instruction Following: Built upon the Mistral-7B-Instruct-v0.2 base, it is designed to follow instructions effectively.
- Optimized Architecture: Incorporates advanced architectural features like Grouped-Query Attention and Sliding-Window Attention for improved efficiency.
- Reduced Size: The 10% sparsity can lead to a smaller memory footprint and potentially faster inference than the dense model, provided the runtime or storage format exploits the zeroed weights.
When to Use This Model
- Resource-Constrained Environments: Ideal for scenarios where computational resources or memory are limited, but instruction-following capabilities are still required.
- Experimentation with Pruning: Useful for developers interested in exploring the performance of pruned models without extensive retraining.
- General Instruction-Following: Suitable for a wide range of tasks that benefit from an instruction-tuned model, leveraging its base model's capabilities.
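Since the model inherits Mistral-7B-Instruct-v0.2's chat format, prompts should wrap each user turn in `[INST] ... [/INST]` tags (in the transformers library, `tokenizer.apply_chat_template` handles this automatically). A minimal single-turn formatter, assuming the tokenizer supplies the leading `<s>` BOS token:

```python
def format_instruction(user_message: str) -> str:
    """Wrap a single-turn user message in Mistral's instruction tags.

    The BOS token <s> is added by the tokenizer, so only the
    [INST] ... [/INST] wrapping is handled here.
    """
    return f"[INST] {user_message.strip()} [/INST]"
```

The model's completion then follows the closing `[/INST]` tag; multi-turn conversations repeat the pattern with the assistant reply and an `</s>` token between turns.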