Alibaba-Apsara/DASD-4B-Thinking

Hosted on Hugging Face

Task: Text Generation · Model size: 4B · Quantization: BF16 · Context length: 32k · Published: Dec 25, 2025 · License: apache-2.0 · Architecture: Transformer

DASD-4B-Thinking is a 4 billion parameter dense language model developed by Alibaba-Apsara, specialized in long chain-of-thought (Long-CoT) reasoning across mathematics, code generation, and scientific reasoning. Post-trained from Qwen3-4B-Instruct-2507 and distilled from gpt-oss-120b using a novel distribution-aligned sequence distillation pipeline, it achieves strong reasoning performance with only 448K training samples. This model excels in complex reasoning tasks and can run on consumer hardware, offering a data-efficient solution for advanced analytical applications.


DASD-4B-Thinking: A Compact Model for Long-CoT Reasoning

DASD-4B-Thinking, developed by Alibaba-Apsara, is a 4 billion parameter dense language model engineered for long chain-of-thought (Long-CoT) reasoning in mathematics, code generation, and scientific domains. It is post-trained from Qwen3-4B-Instruct-2507 and distilled from gpt-oss-120b using a unique distribution-aligned sequence distillation pipeline.
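Because the model emits a long chain of thought before its final answer, downstream code usually needs to separate the two. Below is a minimal sketch of such a parser, assuming the model wraps its reasoning in `<think>...</think>` tags as Qwen-style thinking models typically do; the exact delimiter for DASD-4B-Thinking is not confirmed here, so adjust the tags to match the model's actual output format.

```python
def split_reasoning(output_text: str) -> tuple[str, str]:
    """Split a generation into (reasoning, final_answer).

    Assumes the Qwen-style convention of wrapping chain-of-thought in
    <think>...</think>; this is an assumption, not a documented contract
    for DASD-4B-Thinking.
    """
    open_tag, close_tag = "<think>", "</think>"
    start = output_text.find(open_tag)
    end = output_text.find(close_tag)
    if start == -1 or end == -1 or end < start:
        # No complete reasoning block: treat the whole text as the answer.
        return "", output_text.strip()
    reasoning = output_text[start + len(open_tag):end].strip()
    answer = output_text[end + len(close_tag):].strip()
    return reasoning, answer
```

In practice you would decode the model's output with the `transformers` library (loading `Alibaba-Apsara/DASD-4B-Thinking` via `AutoModelForCausalLM`) and pass the decoded string through this helper to log or hide the reasoning trace separately from the answer.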

Key Capabilities

  • Specialized Long-CoT Reasoning: Excels in complex, multi-step reasoning tasks across various technical fields.
  • Extreme Data Efficiency: Achieves strong performance with only 448K training samples, far fewer than comparable reasoning models typically require.
  • Novel Distillation Pipeline: Introduces a new paradigm of Distribution-Aligned Sequence Distillation, incorporating Temperature-scheduled Learning, Divergence-aware Sampling, and Mixed-policy Distillation.
  • Open-Source Data: The training datasets, including "Superior-Reasoning-SFT-gpt-oss-120b", are open-sourced to enable reproducibility and community contributions.
  • Competitive Benchmarks: Outperforms many larger open-source models on benchmarks such as AIME24, AIME25, LiveCodeBench, and GPQA-D.
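To make the distillation ingredients above concrete, here is a toy numpy sketch of two of the three named components: temperature-scheduled learning (annealing the softening temperature applied to teacher logits) and divergence-aware sampling (prioritizing sequences where the student diverges most from the teacher). This is an illustrative sketch under those assumptions, not the paper's actual formulation, and mixed-policy distillation is not shown.

```python
import numpy as np

def softmax(logits, temperature=1.0):
    # Temperature-scaled softmax over the vocabulary axis.
    z = logits / temperature
    z = z - z.max(axis=-1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

def kl_divergence(p, q, eps=1e-12):
    # KL(p || q), averaged over token positions.
    return float(np.mean(np.sum(p * (np.log(p + eps) - np.log(q + eps)), axis=-1)))

def temperature_schedule(step, total_steps, t_start=2.0, t_end=1.0):
    # Linearly anneal from a softened teacher distribution (t_start)
    # toward the raw one (t_end) over the course of training.
    frac = step / max(total_steps - 1, 1)
    return t_start + (t_end - t_start) * frac

def divergence_aware_select(teacher_logits, student_logits, k):
    # Keep the k sequences where the student diverges most from the
    # teacher, focusing distillation on what the student has not learned.
    divs = [kl_divergence(softmax(t), softmax(s))
            for t, s in zip(teacher_logits, student_logits)]
    order = np.argsort(divs)[::-1]
    return order[:k].tolist()
```

A training loop built on these pieces would, at each step, compute the current temperature from `temperature_schedule`, soften the teacher distribution with it, select high-divergence sequences with `divergence_aware_select`, and minimize the KL term on those sequences.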

Good for

  • Applications requiring robust mathematical and scientific reasoning.
  • Code generation tasks demanding logical thought processes.
  • Deploying advanced reasoning capabilities on consumer-grade hardware due to its compact size.
  • Researchers interested in data-efficient distillation methods and scalable reasoning models.

Limitations

Currently, DASD-4B-Thinking operates purely in the text domain and does not support tool integration or function calling, which limits its utility in agent-based workflows. Future iterations aim to address this by adding capabilities such as knowledge retrieval and tool invocation.