LiamCarter/icl-pruning-wanda-sparsity-0.3

Text Generation · Model Size: 7B · Quant: FP8 · Ctx Length: 4k · Published: Apr 23, 2026 · Architecture: Transformer

LiamCarter/icl-pruning-wanda-sparsity-0.3 is a 7-billion-parameter language model derived from Meta's Llama-2-7b-hf, pruned with the Wanda method at a sparsity ratio of 0.3. The model supports research into efficient architectures through weight sparsity, and is intended for experimentation in model compression and sparse neural networks.


Model Overview

This repository hosts LiamCarter/icl-pruning-wanda-sparsity-0.3, a 7-billion-parameter language model based on meta-llama/Llama-2-7b-hf. The model has been pruned with the Wanda method at a sparsity ratio of 0.3: 30% of the weights are set to zero, reducing the effective parameter count while aiming to preserve performance. Note that the checkpoint itself is stored in dense format, so its on-disk size is unchanged.
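For orientation, Wanda scores each weight by its magnitude times the L2 norm of the corresponding input activation channel, then zeroes the lowest-scoring weights within each output row. The sketch below is a minimal, illustrative NumPy implementation of that metric, not the code used to produce this checkpoint; the function name and shapes are assumptions.

```python
import numpy as np

def wanda_prune(weight: np.ndarray, act_norms: np.ndarray,
                sparsity: float) -> np.ndarray:
    """Zero the lowest-scoring weights in each output row.

    weight:    (out_features, in_features) dense weight matrix
    act_norms: (in_features,) L2 norms of calibration activations,
               one per input channel
    sparsity:  fraction of weights to remove, e.g. 0.3
    """
    # Wanda importance metric: |W_ij| * ||X_j||_2
    score = np.abs(weight) * act_norms
    k = int(weight.shape[1] * sparsity)  # weights to drop per row
    pruned = weight.copy()
    if k > 0:
        # indices of the k lowest-scoring weights in each row
        idx = np.argsort(score, axis=1)[:, :k]
        np.put_along_axis(pruned, idx, 0.0, axis=1)
    return pruned

# Example: prune a random 8x10 layer at sparsity 0.3
rng = np.random.default_rng(0)
w = rng.standard_normal((8, 10))
norms = rng.uniform(0.5, 2.0, size=10)
w_sparse = wanda_prune(w, norms, 0.3)
```

Comparing scores within each output row (rather than globally) mirrors Wanda's per-output comparison group, which the original paper found important for keeping pruned layers balanced.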

Key Characteristics

  • Base Model: meta-llama/Llama-2-7b-hf
  • Pruning Method: Wanda
  • Sparsity: 0.3 (30% of the weights are set to zero)
  • Format: standard Hugging Face `transformers` checkpoint
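Because the checkpoint uses the standard `transformers` layout, it should load with the usual `from_pretrained` call. The sketch below loads the model and spot-checks the sparsity of one linear layer; the layer path assumes the standard Llama module naming, and the download (roughly 13 GB) is gated behind an environment flag so the helper can be exercised on its own.

```python
import os
import numpy as np

def fraction_zero(arr: np.ndarray) -> float:
    """Fraction of entries that are exactly zero (i.e. pruned)."""
    return float(np.mean(arr == 0))

if os.environ.get("RUN_DEMO"):  # guard: downloads ~13 GB of weights
    import torch
    from transformers import AutoModelForCausalLM, AutoTokenizer

    repo = "LiamCarter/icl-pruning-wanda-sparsity-0.3"
    tokenizer = AutoTokenizer.from_pretrained(repo)
    model = AutoModelForCausalLM.from_pretrained(repo, torch_dtype=torch.float16)

    # Spot-check one MLP projection; the value should be near 0.3.
    w = model.model.layers[0].mlp.gate_proj.weight.detach().cpu().numpy()
    print(f"layer-0 gate_proj sparsity: {fraction_zero(w):.3f}")
```

Checking a few layers this way is a quick sanity test that the pruned weights survived serialization; attention and MLP projections are the layers Wanda targets, while embeddings typically remain dense.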

Purpose and Use Cases

This model is primarily an experimental artifact, preserving the original local files from a research project. It is particularly relevant for:

  • Research in Model Compression: Investigating the effectiveness of the Wanda pruning technique.
  • Sparse Neural Networks: Studying the behavior and performance of models with high sparsity.
  • Efficiency Studies: Exploring methods to reduce model size and computational requirements.

Note that while some directories in this repository are standard Hugging Face checkpoints, others are experiment bundles that may require custom loading code.
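Before writing loading code, it can help to list the repository's files and check whether a given directory looks like a standard checkpoint. The sketch below uses `huggingface_hub.list_repo_files` for the listing; the `looks_like_hf_checkpoint` heuristic is a hypothetical helper, and the network call is gated behind an environment flag.

```python
import os

def looks_like_hf_checkpoint(filenames) -> bool:
    """Heuristic: a standard transformers checkpoint ships a
    config.json plus at least one weight shard."""
    names = set(filenames)
    has_weights = any(n.endswith((".safetensors", ".bin")) for n in names)
    return "config.json" in names and has_weights

if os.environ.get("RUN_DEMO"):  # guard: requires network access
    from huggingface_hub import list_repo_files

    files = list_repo_files("LiamCarter/icl-pruning-wanda-sparsity-0.3")
    if looks_like_hf_checkpoint(files):
        print("standard checkpoint: load with from_pretrained")
    else:
        print("experiment bundle: custom loading code likely needed")
```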