Name: AsphaltProAT/deepseek_r1_distilled_qwen_7B_sparse_50 API
Brand: Featherless.ai
Price: 10.00 USD
Availability: InStock
Author: AsphaltProAT

Model Overview

AsphaltProAT/deepseek_r1_distilled_qwen_7B_sparse_50 is a 7.6-billion-parameter model derived from deepseek-ai/DeepSeek-R1-Distill-Qwen-7B. It serves as a proof of concept for PE-MoE architectures, specifically demonstrating that reasoning quality can be preserved after significant unstructured pruning.

Key Characteristics

Base Model: DeepSeek-R1-Distill-Qwen-7B.
Sparsity Method: Unstructured pruning using SparseGPT, targeting 50% sparsity.
Achieved Sparsity: 42.95% measured sparsity. Weights were pruned using calibration data from 128 GSM8K math problems.
Reasoning Preservation: The model retains its ability to perform multi-step reasoning, successfully solving word problems and providing step-by-step explanations.
Hardware: Developed using a Kaggle T4 GPU.
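
As a rough illustration of how a sparsity figure like the one above can be measured, the sketch below counts exactly-zero weights per layer and overall. It is a minimal sketch in plain Python under stated assumptions: the layer names and values are toy data invented for the example, and the model card does not say how the 42.95% figure was actually computed or which layers were pruned.

```python
def count_zeros(weights):
    """Return (zero_count, total_count) for a flat list of weights."""
    zeros = sum(1 for w in weights if w == 0.0)
    return zeros, len(weights)


def sparsity_report(layers):
    """Compute per-layer and overall sparsity for a dict of name -> flat weights.

    Overall sparsity is total zeros divided by total parameters, which is the
    same as a parameter-weighted average of the per-layer sparsities.
    """
    per_layer = {}
    total_zeros = 0
    total_params = 0
    for name, weights in layers.items():
        zeros, count = count_zeros(weights)
        per_layer[name] = zeros / count
        total_zeros += zeros
        total_params += count
    overall = total_zeros / total_params
    return per_layer, overall


# Toy data (hypothetical, not taken from the model):
# one layer pruned to 50%, one layer left untouched.
toy_layers = {
    "pruned_linear": [0.0, 0.3, 0.0, -1.2, 0.0, 0.7, 0.0, 0.5],
    "unpruned_embedding": [0.1, -0.4, 0.9, 0.2],
}

per_layer, overall = sparsity_report(toy_layers)
for name, value in per_layer.items():
    print(f"{name}: {value:.2%}")
print(f"overall: {overall:.2%}")
```

In this toy case the pruned layer sits at its 50% target while the overall figure is lower, because parameters that were never pruned still count in the denominator. Whether that effect contributes to the gap reported for this model is an assumption, not something the card states.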

Limitations and Considerations

Unstructured Sparsity: Requires sparse-aware inference engines to fully realize memory and computational benefits.
Calibration Data: Calibration used general math problems (GSM8K), not domain-specific data.
Quantization: The model is not yet quantized; an AWQ step has not been applied.
Sparsity Variation: The achieved sparsity of 42.95% falls about 7 percentage points short of the 50% target due to layer-wise variations during pruning.
Evaluation Scope: Quality was tested mainly on simple math problems rather than comprehensive benchmarks, so the results should be read as a focused proof of concept.
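
To make the unstructured-sparsity limitation concrete, the sketch below compares the storage cost of one weight matrix kept dense against the same matrix in a simple CSR (compressed sparse row) layout. Everything here is an assumption for illustration: the 4096 x 4096 shape, 2-byte values, 4-byte indices, and the CSR layout itself are not taken from the model card, and real sparse-aware engines may use more compact formats.

```python
def dense_bytes(rows, cols, bytes_per_value=2):
    """Dense storage: every entry is stored, zero or not."""
    return rows * cols * bytes_per_value


def csr_bytes(rows, cols, sparsity, bytes_per_value=2, bytes_per_index=4):
    """Simple CSR storage: non-zero values, their column indices, row pointers."""
    nnz = round(rows * cols * (1.0 - sparsity))
    values = nnz * bytes_per_value
    col_indices = nnz * bytes_per_index
    row_pointers = (rows + 1) * bytes_per_index
    return values + col_indices + row_pointers


# Hypothetical matrix shape, chosen only for illustration.
rows, cols = 4096, 4096
dense = dense_bytes(rows, cols)

for sparsity in (0.4295, 0.50, 0.90):
    sparse = csr_bytes(rows, cols, sparsity)
    ratio = sparse / dense
    print(f"sparsity {sparsity:.2%}: CSR is {ratio:.2f}x the dense size")
```

Under these assumptions a dense checkpoint is the same size whether or not weights are zero, and a plain CSR layout only becomes smaller than dense storage once sparsity passes roughly two-thirds, since each surviving weight also carries an index. This is one reason the benefits at around 43% to 50% sparsity depend on the inference engine and storage format.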
