AsphaltProAT/deepseek_r1_distilled_qwen_7B_sparse_50

Hugging Face
  • Task: Text Generation
  • Concurrency Cost: 1
  • Model Size: 7.6B
  • Quant: FP8
  • Ctx Length: 32k
  • Published: May 19, 2026
  • License: apache-2.0
  • Architecture: Transformer
  • Open Weights
  • Warm

The AsphaltProAT/deepseek_r1_distilled_qwen_7B_sparse_50 model is a 7.6 billion parameter language model derived from DeepSeek-R1-Distill-Qwen-7B and pruned to 42.95% unstructured sparsity (against a 50% target). Developed by AsphaltProAT, it is primarily a proof of concept for PE-MoE architectures: it indicates that multi-step reasoning quality can be preserved after significant weight pruning, as tested on word problems solved with step-by-step explanations.


Model Overview

AsphaltProAT/deepseek_r1_distilled_qwen_7B_sparse_50 is a 7.6 billion parameter model derived from deepseek-ai/DeepSeek-R1-Distill-Qwen-7B. This model serves as a proof of concept for PE-MoE architectures, specifically demonstrating the preservation of reasoning quality after significant unstructured pruning.

Key Characteristics

  • Base Model: DeepSeek-R1-Distill-Qwen-7B.
  • Sparsity Method: Unstructured pruning using SparseGPT, targeting 50% sparsity.
  • Achieved Sparsity: 42.95% actual sparsity, with weights pruned based on calibration data from GSM8K math problems (128 samples).
  • Reasoning Preservation: The model retains its ability to perform multi-step reasoning, successfully solving word problems and providing step-by-step explanations.
  • Hardware: Developed using a Kaggle T4 GPU.
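
The achieved sparsity figure is a count of zeroed weights divided by the total number of weights. A minimal sketch of that measurement follows, assuming the weights are available as flat Python lists; with a real checkpoint one would iterate over the model's tensors instead. The layer names and values are hypothetical and are not taken from this model. The toy data also shows one way an overall figure can land below a per-layer target when some parameters stay dense; the model card does not say whether that is what happened here.

```python
def layer_sparsity(weights):
    """Fraction of entries in a flat list of weights that are exactly zero."""
    if not weights:
        return 0.0
    return sum(1 for w in weights if w == 0.0) / len(weights)


def overall_sparsity(layers):
    """Parameter-weighted sparsity: total zeros divided by total entries."""
    total = sum(len(w) for w in layers.values())
    zeros = sum(sum(1 for x in w if x == 0.0) for w in layers.values())
    return zeros / total if total else 0.0


# Toy stand-in for a model. Names and values are hypothetical.
toy_layers = {
    "embed": [0.3, -0.1, 0.7, 0.2],                        # left dense
    "mlp.up": [0.0, 0.5, 0.0, -0.4, 0.0, 0.9, 0.0, 0.1],   # half zeroed
}

per_layer = {name: layer_sparsity(w) for name, w in toy_layers.items()}
total = overall_sparsity(toy_layers)

for name, value in per_layer.items():
    print(f"{name}: {value:.2%}")
print(f"overall: {total:.2%}")
```

In this toy case the pruned layer sits exactly at 50%, yet the overall figure is lower because the dense layer's entries count toward the denominator.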

Limitations and Considerations

  • Unstructured Sparsity: Requires sparse-aware inference engines to fully realize memory and computational benefits.
  • Calibration Data: Calibration used general math problems (GSM8K), not domain-specific data.
  • Quantization: The model is not yet quantized; an AWQ step has not been applied.
  • Sparsity Variation: The achieved sparsity of 42.95% falls about 7 percentage points short of the 50% target due to layer-wise variations during pruning.
  • Evaluation Scope: Quality was tested primarily on simple math problems, not comprehensive benchmarks, indicating a focused proof-of-concept evaluation.
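
The point about sparse-aware inference engines can be made concrete with a storage estimate. The sketch below compares a dense layout against two generic sparse layouts at 42.95% sparsity. All of it is an illustrative assumption: the 4096 x 4096 layer shape, the 16-bit values, the 32-bit CSR indices, and the bitmask layout are hypothetical and do not describe this model or any particular engine.

```python
def dense_bytes(rows, cols, value_bytes=2):
    """Dense storage: every entry is stored, including zeros."""
    return rows * cols * value_bytes


def csr_bytes(rows, cols, sparsity, value_bytes=2, index_bytes=4):
    """CSR storage: one value and one column index per nonzero,
    plus a row-pointer array of length rows + 1."""
    nnz = round(rows * cols * (1.0 - sparsity))
    return nnz * (value_bytes + index_bytes) + (rows + 1) * index_bytes


def bitmask_bytes(rows, cols, sparsity, value_bytes=2):
    """Bitmask storage: one bit per entry marking nonzeros,
    plus the packed nonzero values."""
    n = rows * cols
    nnz = round(n * (1.0 - sparsity))
    return nnz * value_bytes + (n + 7) // 8


rows, cols = 4096, 4096   # hypothetical layer shape
sparsity = 0.4295         # achieved sparsity reported for this model

dense = dense_bytes(rows, cols)
csr = csr_bytes(rows, cols, sparsity)
bitmask = bitmask_bytes(rows, cols, sparsity)

print(f"dense:   {dense / 2**20:.1f} MiB")
print(f"csr:     {csr / 2**20:.1f} MiB ({csr / dense:.2f}x dense)")
print(f"bitmask: {bitmask / 2**20:.1f} MiB ({bitmask / dense:.2f}x dense)")
```

Under these assumptions, index overhead makes CSR larger than the dense layout at this sparsity level, while the bitmask layout is smaller. Whether pruning yields a real saving therefore depends on the format and kernels the inference engine uses.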