MilyaShams/Qwen3-1.7B-Wanda_unstruct_0.4
MilyaShams/Qwen3-1.7B-Wanda_unstruct_0.4 is a 1.7 billion parameter language model based on the Qwen3 architecture, compressed using the llmcompressor framework. This model has undergone unstructured pruning with a 40% sparsity level, targeting linear layers for efficiency. It is designed for applications requiring a smaller, more efficient model while retaining capabilities derived from its Qwen3 base.
Loading preview...
Overview
This model, MilyaShams/Qwen3-1.7B-Wanda_unstruct_0.4, is a compressed version of the Qwen/Qwen3-1.7B base model. It was created using the llmcompressor framework, specifically employing the Wanda_unstruct_0.4 experiment recipe.
Compression Details
The compression process involved applying a 40% unstructured sparsity to the model's linear layers. This technique aims to reduce the model's size and computational requirements by removing a significant portion of its parameters without a predefined structure, potentially making it more efficient for deployment in resource-constrained environments.
Key Characteristics
- Base Model: Qwen3-1.7B, indicating its foundational architecture and initial capabilities.
- Parameter Count: Approximately 1.7 billion parameters, making it a relatively compact model.
- Compression Method: Unstructured pruning with a 40% sparsity level, applied to
Linearlayers within theQwen3DecoderLayer. - Framework: Compressed using
llmcompressor, a framework designed for model optimization.
Potential Use Cases
This compressed model is suitable for scenarios where:
- Resource efficiency is critical, such as edge devices or applications with strict memory/compute budgets.
- Faster inference is desired due to the reduced parameter count.
- Leveraging the capabilities of the Qwen3 architecture in a more lightweight package is beneficial.