MilyaShams/Qwen3-1.7B-Wanda_unstruct_0.5

TEXT GENERATION · Concurrency Cost: 1 · Model Size: 2B · Quant: BF16 · Ctx Length: 32k · Published: Apr 8, 2026 · Architecture: Transformer · Cold

MilyaShams/Qwen3-1.7B-Wanda_unstruct_0.5 is a 1.7-billion-parameter language model based on the Qwen3 architecture, compressed with the llmcompressor framework. It applies 50% unstructured sparsity via the Wanda pruning method to the Linear layers within each Qwen3DecoderLayer. It is intended for more efficient deployment and inference while staying close to the behavior of its base model, Qwen/Qwen3-1.7B.

Model Overview

This model, MilyaShams/Qwen3-1.7B-Wanda_unstruct_0.5, is a compressed version of the Qwen/Qwen3-1.7B base model. It was created using the llmcompressor framework, which applies various techniques to reduce model size and improve inference efficiency.

Compression Details

The primary differentiator of this model is its compression strategy. It uses the Wanda_unstruct_0.5 recipe, which prunes to a sparsity of 0.5: approximately half of the weights in the targeted layers, the Linear layers within the Qwen3DecoderLayer modules, are set to zero. Wanda ranks each weight by its magnitude multiplied by the norm of the corresponding input activation, measured on calibration data, and removes the lowest-ranked weights. The compression aims for a smaller footprint and potentially faster inference without significant performance degradation relative to the original 1.7 billion parameter Qwen3 model.
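As a rough illustration of how Wanda-style pruning selects weights, the sketch below scores each weight by its absolute value times the L2 norm of its input feature over a few calibration samples, then zeroes the lowest-scoring half of each output row. It is a minimal pure-Python toy under stated assumptions: the function name `wanda_prune` and the tiny matrices are hypothetical, and it is not the llmcompressor implementation used to produce this model.

```python
import math

def wanda_prune(weight, activations, sparsity=0.5):
    """Return a pruned copy of `weight` (hypothetical helper, toy scale).

    weight:      list of rows, one row per output feature, each of length n_in
    activations: list of calibration samples, each of length n_in
    """
    n_in = len(weight[0])
    # L2 norm of each input feature across the calibration samples.
    norms = [math.sqrt(sum(x[j] ** 2 for x in activations)) for j in range(n_in)]
    n_prune = int(n_in * sparsity)  # number of weights removed per output row
    pruned = []
    for row in weight:
        # Wanda-style score: |weight| times the norm of its input feature.
        scores = [abs(w) * norms[j] for j, w in enumerate(row)]
        # Indices of the lowest-scoring weights within this row.
        drop = set(sorted(range(n_in), key=lambda j: scores[j])[:n_prune])
        pruned.append([0.0 if j in drop else w for j, w in enumerate(row)])
    return pruned

# Made-up example: input feature 1 has large activations, feature 2 tiny ones.
W = [[0.9, -0.1, 0.5, 0.2],
     [0.05, 0.8, -0.3, 0.6]]
X = [[1.0, 10.0, 0.1, 1.0],
     [1.0, 10.0, 0.1, 1.0]]
pruned = wanda_prune(W, X, sparsity=0.5)
print(pruned)
```

In the first row the small weight -0.1 survives because its input feature is strongly activated, while the larger weight 0.5 is removed because its input is nearly silent. This is where the approach differs from pruning on weight magnitude alone.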

Key Characteristics

  • Base Architecture: Qwen3-1.7B
  • Parameter Count: Approximately 1.7 billion (pre-compression; roughly half of the weights in the targeted layers are zero after pruning)
  • Compression Method: Wanda (pruning by Weights and Activations)
  • Sparsity: 0.5 (50% unstructured sparsity)
  • Targeted Layers: Linear layers within Qwen3DecoderLayer
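To make the relationship between sparsity and storage concrete, the following back-of-the-envelope sketch compares a dense BF16 layout (2 bytes per parameter) with a hypothetical bitmask-compressed layout that stores only nonzero values plus one mask bit per weight. The helper names and the bitmask format are assumptions for illustration, not a description of how this checkpoint is stored. The calculation also treats all 1.7 billion parameters as prunable, which overstates the savings because only the Linear layers in the decoder blocks are targeted.

```python
def dense_bytes(n_params, bytes_per_param=2):
    """Storage for a dense tensor layout; BF16 uses 2 bytes per value."""
    return n_params * bytes_per_param

def bitmask_bytes(n_params, sparsity, bytes_per_param=2):
    """Hypothetical sparse layout: nonzero values plus a 1-bit-per-weight mask."""
    n_nonzero = round(n_params * (1 - sparsity))
    return n_nonzero * bytes_per_param + n_params // 8

n_params = 1_700_000_000  # approximate parameter count from the model card
dense = dense_bytes(n_params)
sparse = bitmask_bytes(n_params, sparsity=0.5)
print(f"dense BF16:        {dense / 1e9:.2f} GB")
print(f"bitmask (assumed): {sparse / 1e9:.2f} GB")
```

Stored densely, the zeros still occupy space, so any size or speed benefit from unstructured sparsity depends on a storage format or runtime that exploits it.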

Potential Use Cases

This compressed model is suitable for applications where computational resources or deployment size are critical constraints. It can be particularly useful for:

  • Edge device deployment
  • Applications requiring faster inference times
  • Scenarios where slightly reduced performance is acceptable in exchange for significant resource savings