alibaba-pai/DistillQwen-ThoughtY-32B

Text Generation · Concurrency Cost: 2 · Model Size: 32B · Quantization: FP8 · Context Length: 32k · License: apache-2.0 · Architecture: Transformer · Open Weights

alibaba-pai/DistillQwen-ThoughtY-32B is a 32-billion-parameter causal language model from the DistillQwen-ThoughtY series, developed by Alibaba-PAI. It is optimized for enhanced Chain-of-Thought (CoT) reasoning, outperforming previous versions and Qwen3 on complex mathematical, scientific, and coding tasks. The model is trained on OmniThought-0528, a dataset of 365K high-quality CoT traces, and achieves state-of-the-art performance in reasoning-intensive applications. It is designed for use cases that require robust step-by-step problem solving.


DistillQwen-ThoughtY-32B: Enhanced Chain-of-Thought Reasoning

DistillQwen-ThoughtY-32B is a 32-billion-parameter model developed by Alibaba-PAI as part of the DistillQwen-ThoughtY series, engineered for advanced Chain-of-Thought (CoT) reasoning. It significantly improves on the earlier DistillQwen-ThoughtX series and on Qwen3 in "thinking mode" tasks.

Key Capabilities & Differentiators

  • Superior Reasoning Performance: Achieves state-of-the-art results across mathematical, scientific, and coding benchmarks. For instance, DistillQwen-ThoughtY-32B scores 90.0 on AIME2024 and 95.2 on MATH500, demonstrating strong analytical and problem-solving skills.
  • OmniThought-0528 Dataset: Trained on a 365K-example, high-quality CoT dataset distilled from top-tier models such as DeepSeek-R1-0528 and QwQ-32B. The dataset carries Cognitive Difficulty (CD) and Reasoning Verbosity (RV) annotations, which contribute to the model's enhanced reasoning.
  • Optimized for Complex Tasks: Designed to excel in scenarios requiring detailed, step-by-step reasoning, making it suitable for applications that demand more than direct answers.
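A common way to serve a model like this is behind an OpenAI-compatible endpoint (for example, one launched locally with vLLM). The sketch below shows what a request might look like using only the Python standard library; the endpoint URL, sampling settings, and token budget are assumptions, not documented values — only the model ID comes from this page. Because the model emits its chain of thought before the final answer, leave a generous `max_tokens` budget.

```python
# Hedged sketch: querying DistillQwen-ThoughtY-32B through an assumed
# OpenAI-compatible chat-completions endpoint (e.g. served by vLLM).
# Only MODEL_ID is taken from this page; everything else is illustrative.
import json
import urllib.request

MODEL_ID = "alibaba-pai/DistillQwen-ThoughtY-32B"

def build_payload(question: str, max_tokens: int = 2048) -> dict:
    """Build a chat-completions request body. CoT models spend many
    tokens on reasoning, so the default budget is deliberately large."""
    return {
        "model": MODEL_ID,
        "messages": [{"role": "user", "content": question}],
        "max_tokens": max_tokens,
        "temperature": 0.6,  # moderate sampling; tune for your workload
    }

def ask(question: str, base_url: str = "http://localhost:8000/v1") -> str:
    """POST the request and return the assistant message content."""
    req = urllib.request.Request(
        f"{base_url}/chat/completions",
        data=json.dumps(build_payload(question)).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        body = json.load(resp)
    return body["choices"][0]["message"]["content"]
```

For a math benchmark-style query you would call `ask("If 3x + 7 = 22, what is x? Think step by step.")`; the response should contain the worked reasoning followed by the final answer.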

When to Use This Model

  • Mathematical Problem Solving: Ideal for tasks involving complex equations, proofs, and quantitative analysis.
  • Scientific Inquiry: Useful for applications in scientific research, data interpretation, and hypothesis generation.
  • Code Generation & Debugging: Strong performance in coding tasks, suggesting utility for developers needing assistance with logical code structures.
  • Educational Tools: Can be integrated into systems that require explaining solutions or demonstrating thought processes.
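For educational tools that display the model's thought process separately from its final answer, the response usually needs to be split in two. The helper below assumes the model wraps its reasoning in `<think>...</think>` tags, as DeepSeek-R1-style distillations commonly do; this tag convention is an assumption about the output format, so verify it against actual model output before relying on it.

```python
# Hedged sketch: separating chain-of-thought from the final answer,
# assuming <think>...</think> delimiters (common in DeepSeek-R1-style
# distilled models, but not confirmed by this page).
import re

def split_thought(text: str) -> tuple[str, str]:
    """Return (reasoning, answer). If no <think> block is found,
    treat the whole response as the answer."""
    m = re.search(r"<think>(.*?)</think>", text, re.DOTALL)
    if not m:
        return "", text.strip()
    return m.group(1).strip(), text[m.end():].strip()
```

A tutoring UI could then render the reasoning in a collapsible panel and show only the answer by default.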