LiamCarter/icl-pruning-wanda-sparsity-0.4
LiamCarter/icl-pruning-wanda-sparsity-0.4 is a 7 billion parameter language model based on the Llama-2-7b-hf architecture, developed by LiamCarter. The model has been pruned with the Wanda method at a sparsity ratio of 0.4, targeting efficient inference through model compression. It is intended for applications that need a smaller, more optimized model while retaining the capabilities of its Llama-2 base.
Overview
This model, LiamCarter/icl-pruning-wanda-sparsity-0.4, is a 7 billion parameter variant derived from the meta-llama/Llama-2-7b-hf base model. It has undergone pruning using the Wanda method with a sparsity level of 0.4. This process aims to reduce the model's size and computational requirements while maintaining performance.
Key Characteristics
- Base Model: meta-llama/Llama-2-7b-hf
- Pruning Method: Wanda
- Sparsity: 0.4 (40% of the weights set to zero)
- Format: Standard Hugging Face transformers checkpoint (see the loading sketch below)
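
Because the card describes the checkpoint as a standard transformers causal LM, it should load like any other Llama-2 variant. The snippet below is a minimal sketch, assuming the usual `AutoModelForCausalLM` path; the dtype and `device_map` settings are illustrative choices, not requirements from this card.

```python
# Minimal loading sketch for this pruned checkpoint (assumes a standard
# transformers causal-LM layout, as stated in the model card).
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "LiamCarter/icl-pruning-wanda-sparsity-0.4"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.float16,  # half precision keeps the 7B weights near ~14 GB
    device_map="auto",          # requires `accelerate`; omit to load on CPU
)

prompt = "Explain what weight pruning does to a language model."
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=64)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```

Note that unstructured 40% sparsity does not by itself shrink the stored checkpoint or speed up dense matrix multiplies; realizing those gains typically requires sparse-aware kernels or further compression downstream.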
Potential Use Cases
This pruned model is particularly suitable for scenarios where:
- Efficient Inference: Reduced model size and computational load are critical.
- Resource-Constrained Environments: Deployment on devices with limited memory or processing power.
- Exploration of Pruning Techniques: Researchers and developers interested in the impact of Wanda pruning on Llama-2 models (a minimal sparsity check is sketched below).
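
For the exploration use case, a quick sanity check is to measure how many weights are exactly zero and compare against the reported 0.4 ratio. The sketch below is an assumption about where Wanda typically prunes (the linear projections inside the transformer blocks, excluding the LM head); it is not a procedure documented by this model card.

```python
# Rough sketch: measure the fraction of exactly-zero weights in the
# linear layers to verify the reported 0.4 sparsity.
import torch
from transformers import AutoModelForCausalLM

model = AutoModelForCausalLM.from_pretrained(
    "LiamCarter/icl-pruning-wanda-sparsity-0.4",
    torch_dtype=torch.float16,
)

zero, total = 0, 0
for name, module in model.named_modules():
    # Skip the LM head; Wanda is usually applied to the block-internal projections.
    if isinstance(module, torch.nn.Linear) and "lm_head" not in name:
        w = module.weight.detach()
        zero += (w == 0).sum().item()
        total += w.numel()

print(f"Measured sparsity over linear layers: {zero / total:.3f}")  # expect ~0.4
```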