wang7776/Mistral-7B-Instruct-v0.2-sparsity-30-v0.1
Text Generation · Concurrency Cost: 1 · Model Size: 7B · Quant: FP8 · Ctx Length: 8k · Published: Jan 17, 2024 · License: apache-2.0 · Architecture: Transformer · Open Weights
wang7776/Mistral-7B-Instruct-v0.2-sparsity-30-v0.1 is a 7-billion-parameter instruction-tuned causal language model based on Mistral-7B-Instruct-v0.2, pruned to 30% sparsity with the Wanda method. Wanda requires no retraining or weight updates, and the pruned model aims to retain performance competitive with the dense original. It is designed for instruction-following tasks and inherits Mistral's grouped-query attention, sliding-window attention, and byte-fallback BPE tokenizer.
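Because Wanda's 30% sparsity is unstructured (weights are zeroed in place, not removed), the checkpoint should load like any standard Mistral-architecture model. A minimal sketch, assuming the repository ships ordinary transformers-format weights; the prompt and generation settings are illustrative, not taken from the card:

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

repo = "wang7776/Mistral-7B-Instruct-v0.2-sparsity-30-v0.1"

tokenizer = AutoTokenizer.from_pretrained(repo)
# The 30% sparsity is unstructured, so the weights load as dense tensors;
# fp16 is a safe local default (the FP8 quant above refers to the hosted deployment).
model = AutoModelForCausalLM.from_pretrained(
    repo,
    torch_dtype=torch.float16,
    device_map="auto",
)

# Mistral-Instruct models use the [INST] ... [/INST] chat format;
# apply_chat_template builds it from the tokenizer's bundled template.
messages = [{"role": "user", "content": "Explain unstructured pruning in one sentence."}]
input_ids = tokenizer.apply_chat_template(messages, return_tensors="pt").to(model.device)

output = model.generate(input_ids, max_new_tokens=128, do_sample=True)
print(tokenizer.decode(output[0][input_ids.shape[-1]:], skip_special_tokens=True))
```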
Popular Sampler Settings
Top 3 parameter combinations used by Featherless users for this model.

temperature: –
top_p: –
top_k: –
frequency_penalty: –
presence_penalty: –
repetition_penalty: –
min_p: –
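These parameters map onto a standard sampling request. A hedged sketch, assuming Featherless exposes an OpenAI-compatible chat completions endpoint: the base URL, the extra_body pass-through for non-standard fields (top_k, min_p, repetition_penalty), and every value below are assumptions or placeholders, since the card's actual top-3 configs did not render in this capture:

```python
from openai import OpenAI

client = OpenAI(
    base_url="https://api.featherless.ai/v1",  # assumed endpoint; verify against the Featherless docs
    api_key="YOUR_FEATHERLESS_API_KEY",
)

response = client.chat.completions.create(
    model="wang7776/Mistral-7B-Instruct-v0.2-sparsity-30-v0.1",
    messages=[{"role": "user", "content": "Summarize the Wanda pruning method."}],
    # Placeholder values, not the card's (unrendered) popular configs.
    temperature=0.7,
    top_p=0.9,
    frequency_penalty=0.0,
    presence_penalty=0.0,
    # Non-standard sampler fields are not part of the OpenAI schema; many
    # OpenAI-compatible servers accept them via the SDK's extra_body pass-through.
    extra_body={"top_k": 40, "min_p": 0.05, "repetition_penalty": 1.1},
)
print(response.choices[0].message.content)
```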