Name: MilyaShams/Qwen3-1.7B-Wanda_1_4 API
Brand: Featherless.ai
Price: 10.00 USD
Availability: InStock
Author: MilyaShams

Model Overview

MilyaShams/Qwen3-1.7B-Wanda_1_4 is a compressed version of the Qwen/Qwen3-1.7B language model, which has approximately 1.7 billion parameters. It was produced with the llmcompressor framework, a toolkit for reducing model size and computational requirements while preserving as much of the original model's quality as possible.

Compression Details

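The compression run, identified as the Wanda_1_4 experiment, applied llmcompressor modifiers to the base Qwen3-1.7B model. The experiment name suggests the Wanda pruning method, which ranks weights by their magnitude scaled by the norm of the corresponding input activations. Key settings include: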

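Sparsity: A sparsity level of 0.25 was applied, meaning roughly one quarter of the targeted weights are set to zero.
Mask Structure: A 1:4 mask structure was used. In N:M pruning notation this corresponds to zeroing one weight in every group of four consecutive weights, which is consistent with the 0.25 sparsity level. It describes a pruning pattern, not quantization.
Target Layers: The compression targeted Linear layers within the model, with updates applied sequentially, one Qwen3DecoderLayer at a time.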
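
To make the 1:4 mask structure concrete, the following is a minimal pure-Python sketch of Wanda-style N:M masking. It is an illustration only, not the llmcompressor implementation: the function name wanda_nm_mask, its parameters, and the toy weight and activation values are all hypothetical. It assumes the usual Wanda score, |weight| multiplied by the L2 norm of the matching input feature, and prunes the lowest-scoring weight in each group of four.

```python
def wanda_nm_mask(weight, act_norm, n_prune=1, m_group=4):
    """Return a keep-mask (True = keep) for an N:M pruning pattern.

    weight   : list of rows, each row holding one output neuron's weights
    act_norm : per-input-feature activation norms (hypothetical calibration stats)
    n_prune  : number of weights to zero in each group
    m_group  : group size along the input dimension
    """
    in_features = len(act_norm)
    if in_features % m_group != 0:
        raise ValueError("input dimension must be divisible by the group size")

    mask = []
    for row in weight:
        if len(row) != in_features:
            raise ValueError("row length does not match act_norm length")
        keep_row = [True] * in_features
        for start in range(0, in_features, m_group):
            group = range(start, start + m_group)
            # Wanda-style score: weight magnitude scaled by input activation norm.
            ranked = sorted(group, key=lambda j: abs(row[j]) * act_norm[j])
            for j in ranked[:n_prune]:
                keep_row[j] = False  # lowest-scoring weights in the group are pruned
        mask.append(keep_row)
    return mask


# Toy example (made-up numbers): one output row, eight input features.
weight = [[0.5, -0.1, 0.3, 0.9, -0.2, 0.4, 0.05, -0.7]]
act_norm = [1.0, 2.0, 1.0, 1.0, 3.0, 1.0, 10.0, 1.0]

mask = wanda_nm_mask(weight, act_norm)
pruned = [[w if keep else 0.0 for w, keep in zip(row, keep_row)]
          for row, keep_row in zip(weight, mask)]

total = sum(len(row) for row in mask)
zeros = sum(1 for row in mask for keep in row if not keep)
sparsity = zeros / total
```

In this toy example the smallest weight by magnitude (0.05) survives because its input feature has a large activation norm, while 0.4 is pruned instead. That is the difference between activation-aware scoring and plain magnitude pruning. Each group of four loses exactly one weight, so the resulting sparsity is 0.25.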

Use Cases

This compressed model is intended for scenarios where computational resources are constrained, such as edge devices or applications that need faster inference. It suits tasks that benefit from a smaller, optimized language model derived from the Qwen3 architecture.
