Name: MilyaShams/Qwen3-1.7B-Wanda_unstruct_0.4 API
Brand: Featherless.ai
Price: 10.00 USD
Availability: InStock
Author: MilyaShams

Overview

This model, MilyaShams/Qwen3-1.7B-Wanda_unstruct_0.4, is a compressed version of the Qwen/Qwen3-1.7B base model. It was created using the llmcompressor framework, specifically employing the Wanda_unstruct_0.4 experiment recipe.

Compression Details

The compression process applied 40% unstructured sparsity to the model's linear layers: 40% of the weights in those layers are set to zero, with no constraint on where the zeros fall within each weight matrix. This can reduce the model's size and computational requirements, making it more efficient to deploy in resource-constrained environments, provided the weights are stored in a sparse format and executed with sparsity-aware kernels.
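As an illustration of what 40% unstructured sparsity means in practice, the sketch below prunes a toy weight matrix with a Wanda-style score (weight magnitude times the L2 norm of the matching input feature over calibration samples). It assumes the recipe name refers to the Wanda pruning method and that pruning is done per output row; the function name wanda_prune and the toy data are hypothetical, and this is not the llmcompressor implementation.

```python
import math


def wanda_prune(weights, activations, sparsity=0.4):
    """Zero the lowest-scoring fraction of weights in each output row.

    weights:     rows of a linear layer, shape [out_features][in_features]
    activations: calibration inputs, shape [n_samples][in_features]
    Assumption: score = |weight| * L2 norm of the corresponding input feature.
    """
    in_features = len(weights[0])
    # L2 norm of each input feature across the calibration samples.
    norms = [
        math.sqrt(sum(sample[j] ** 2 for sample in activations))
        for j in range(in_features)
    ]
    n_prune = int(in_features * sparsity)  # weights to zero per row
    pruned = []
    for row in weights:
        scores = [abs(w) * norms[j] for j, w in enumerate(row)]
        # Indices of the n_prune lowest-scoring weights in this row.
        drop = set(sorted(range(in_features), key=lambda j: scores[j])[:n_prune])
        pruned.append([0.0 if j in drop else w for j, w in enumerate(row)])
    return pruned


if __name__ == "__main__":
    toy_weights = [[0.1, 0.5, 0.3, 0.2, 0.4]]
    toy_inputs = [[10.0, 1.0, 1.0, 1.0, 1.0]]  # first feature is very active
    print(wanda_prune(toy_weights, toy_inputs))
```

Note that the smallest weight (0.1) survives here because its input feature has a large activation norm; plain magnitude pruning would have removed it.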

Key Characteristics

Base Model: Qwen/Qwen3-1.7B, which supplies the architecture and the pre-compression weights.
Parameter Count: Approximately 1.7 billion parameters, as in the base model; pruned weights are zeroed rather than removed from the weight tensors.
Compression Method: Unstructured pruning with a 40% sparsity level, applied to Linear layers within the Qwen3DecoderLayer.
Framework: Compressed using llmcompressor, a framework designed for model optimization.
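One way to check the characteristics listed above is to measure the fraction of zero weights per tensor. The sketch below does this for a dictionary of named weight matrices held as nested lists; the helper names and the toy tensors are hypothetical, and the parameter names only imitate the style of a Qwen3 checkpoint. Under the description above, one would expect Linear layers inside the decoder layers to report about 0.4 and other tensors, such as embeddings, to report close to 0.

```python
def zero_fraction(matrix):
    """Return the fraction of entries in a 2-D nested list that are exactly zero."""
    total = sum(len(row) for row in matrix)
    zeros = sum(1 for row in matrix for value in row if value == 0)
    return zeros / total


def sparsity_report(named_weights):
    """Map each tensor name to its measured zero fraction."""
    return {name: zero_fraction(matrix) for name, matrix in named_weights.items()}


if __name__ == "__main__":
    # Toy stand-ins for real tensors; names are illustrative only.
    toy_weights = {
        "model.layers.0.self_attn.q_proj.weight": [[0.5, 0.0, 0.3, 0.0, -0.4]],
        "model.embed_tokens.weight": [[0.1, 0.2, 0.3, 0.4, 0.5]],
    }
    for name, fraction in sparsity_report(toy_weights).items():
        print(f"{name}: {fraction:.0%} zeros")
```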

Potential Use Cases

This compressed model is suitable for scenarios where:

Resource efficiency is critical, such as edge devices or applications with strict memory/compute budgets.
Faster inference is desired and the runtime can exploit the 40% of weights that are zero; dense kernels still process zeroed weights at full cost.
Leveraging the capabilities of the Qwen3 architecture in a more lightweight package is beneficial.
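To put the memory argument in numbers, the sketch below compares dense 16-bit storage with a simple bitmask sparse layout (one bit per weight plus the nonzero values). It is a rough estimate under stated assumptions: 1.7 billion parameters, 2 bytes per value, and 40% sparsity applied to every parameter, which overstates the savings because only Linear layers are pruned. The function names and the bitmask layout are illustrative and do not describe how this checkpoint is actually stored.

```python
def dense_bytes(n_params, bytes_per_value=2):
    """Bytes needed to store every parameter densely."""
    return n_params * bytes_per_value


def bitmask_sparse_bytes(n_params, sparsity, bytes_per_value=2):
    """Bytes needed for nonzero values plus a 1-bit-per-weight presence mask."""
    nonzero = round(n_params * (1 - sparsity))
    mask_bytes = (n_params + 7) // 8  # round up to whole bytes
    return nonzero * bytes_per_value + mask_bytes


if __name__ == "__main__":
    n_params = 1_700_000_000  # approximate count from the listing
    dense = dense_bytes(n_params)
    sparse = bitmask_sparse_bytes(n_params, sparsity=0.4)
    print(f"dense:  {dense / 1e9:.2f} GB")
    print(f"sparse: {sparse / 1e9:.2f} GB ({sparse / dense:.0%} of dense)")
```

Under these assumptions the estimate is about 3.4 GB dense versus about 2.25 GB with the bitmask layout, so the saving is noticeably smaller than 40% once the mask overhead is counted.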
