Ujjwal-Tyagi/DeepSeek-R1-Distill-Qwen-32B
TEXT GENERATION · Concurrency Cost: 2 · Model Size: 32.8B · Quant: FP8 · Ctx Length: 32k · Published: Mar 29, 2026 · License: MIT · Architecture: Transformer · Open Weights · Cold

DeepSeek-R1-Distill-Qwen-32B is a 32.8-billion-parameter distilled language model from DeepSeek-AI, built on Qwen2.5-32B. It was fine-tuned on reasoning data generated by the larger DeepSeek-R1 model and excels at mathematical, coding, and general reasoning tasks. The model demonstrates that the reasoning patterns of a large model can be transferred effectively to smaller dense models, with performance comparable to larger proprietary models.
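
A minimal inference sketch with the Hugging Face transformers library might look like the following. The repo id is taken from this page; the prompt, dtype handling, and generation settings are illustrative assumptions rather than instructions from the model card.

```python
# Minimal text-generation sketch using Hugging Face transformers.
# Assumes the checkpoint is hosted under the repo id shown on this page;
# the prompt and generation settings below are illustrative only.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "Ujjwal-Tyagi/DeepSeek-R1-Distill-Qwen-32B"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype="auto",   # use the dtype stored in the checkpoint config
    device_map="auto",    # shard across available GPUs (requires accelerate)
)

# A reasoning-style prompt; the chat template ships with the tokenizer.
messages = [{"role": "user", "content": "What is 17 * 24? Think step by step."}]
inputs = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

outputs = model.generate(inputs, max_new_tokens=512)
# Decode only the newly generated tokens, not the echoed prompt.
print(tokenizer.decode(outputs[0][inputs.shape[-1]:], skip_special_tokens=True))
```

Note that a 32.8B model at FP8 quantization still requires substantial GPU memory, so `device_map="auto"` is used here to spread the weights across whatever accelerators are available.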
