MilyaShams/Qwen3-1.7B-Wanda_unstruct_0.6
MilyaShams/Qwen3-1.7B-Wanda_unstruct_0.6 is a 1.7-billion-parameter language model based on the Qwen3 architecture, compressed with the llmcompressor framework. It retains the base model's 32768-token context window and has been pruned to 0.6 unstructured sparsity using the Wanda method, targeting deployments where reduced model size and computational cost are critical.
Model Overview
MilyaShams/Qwen3-1.7B-Wanda_unstruct_0.6 is a compressed version of the Qwen/Qwen3-1.7B base model, developed by MilyaShams. The model was pruned with the llmcompressor framework using the Wanda unstructured sparsity method at a 0.6 sparsity ratio, a compression technique that aims to reduce model size and computational requirements while retaining performance.
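For illustration, the following is a minimal sketch of how such a pruning run might look with llmcompressor's one-shot workflow. This is not the author's actual script: the calibration dataset, sample counts, and some argument names are assumptions and can differ between llmcompressor versions.

```python
# Hypothetical reproduction sketch of the Wanda pruning step.
from llmcompressor.modifiers.pruning import WandaPruningModifier
from llmcompressor.transformers import oneshot

recipe = WandaPruningModifier(
    sparsity=0.6,           # 60% of weights zeroed in each pruned layer
    mask_structure="0:0",   # "0:0" denotes unstructured (no N:M) sparsity
    targets=["Linear"],     # prune Linear submodules, per the model card
)

oneshot(
    model="Qwen/Qwen3-1.7B",
    recipe=recipe,
    dataset="open_platypus",       # assumed calibration dataset
    num_calibration_samples=512,   # assumed
    max_seq_length=2048,           # assumed
    output_dir="Qwen3-1.7B-Wanda_unstruct_0.6",
)
```

Wanda scores each weight by the product of its magnitude and the norm of its input activations, so a short calibration pass over representative text is required to collect those activation statistics.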
Key Characteristics
- Base Architecture: Qwen3-1.7B
- Parameter Count: Approximately 1.7 billion parameters
- Context Length: 32768-token context window, inherited from the base model
- Compression Method: Wanda unstructured sparsity with a 0.6 ratio, applied to `Linear` layers within `Qwen3DecoderLayer` targets
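
The checkpoint can be loaded with the standard transformers API, assuming it is stored as dense weights in a transformers-compatible format (pruned weights held as zeros):

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "MilyaShams/Qwen3-1.7B-Wanda_unstruct_0.6"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype="auto",   # use the dtype recorded in the checkpoint
    device_map="auto",    # place weights on the available device(s)
)

messages = [{"role": "user", "content": "Summarize the Wanda pruning method in two sentences."}]
inputs = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

outputs = model.generate(inputs, max_new_tokens=128)
print(tokenizer.decode(outputs[0][inputs.shape[-1]:], skip_special_tokens=True))
```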
Use Cases
This model is well-suited to applications that need a capable language model with a smaller footprint and lower inference cost. Its compression makes it a good fit for the following scenarios (a serving sketch follows the list):
- Deployment on edge devices or environments with limited computational resources.
- Scenarios where cost-effective scaling of LLM inference is a priority.
- Tasks benefiting from a large context window, such as long-form content generation or complex document analysis, within a resource-constrained setting.
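
As a sketch of cost-effective batched inference, the snippet below serves the model offline with vLLM, assuming the checkpoint loads as a regular dense Qwen3 model. Note that unstructured sparsity only translates into speedups where the runtime has kernels that exploit it; otherwise the benefit comes mainly from any compressed storage format.

```python
from vllm import LLM, SamplingParams

llm = LLM(
    model="MilyaShams/Qwen3-1.7B-Wanda_unstruct_0.6",
    max_model_len=32768,  # full context window from the model card
)
params = SamplingParams(temperature=0.7, top_p=0.9, max_tokens=256)

prompts = ["Draft an outline for a long-form report on edge AI deployment."]
for out in llm.generate(prompts, params):
    print(out.outputs[0].text)
```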