Neura-Tech-AI/DeepSeek-R1-Distill-Qwen-32B
Text generation · Concurrency cost: 2 · Model size: 32.8B · Quant: FP8 · Context length: 32k · Published: Mar 29, 2026 · License: MIT · Architecture: Transformer · Open weights · Cold

Neura-Tech-AI/DeepSeek-R1-Distill-Qwen-32B is a 32.8-billion-parameter distilled language model developed by DeepSeek-AI, based on the Qwen2.5 architecture with a 32,768-token context length. It is fine-tuned on reasoning data generated by the larger DeepSeek-R1 model and shows improved performance on math, code, and reasoning benchmarks. The model is optimized to deliver strong reasoning capabilities in a smaller, dense-model footprint.
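As a sketch of how a checkpoint like this is typically used, the snippet below loads the model with the Hugging Face `transformers` library and generates a completion. This assumes the repo id shown above is served through the standard `transformers` API (not confirmed by this page), and note that a 32.8B model requires substantial GPU memory even at reduced precision.

```python
# Minimal usage sketch, assuming standard transformers-style hosting.
# Downloading and running this model requires tens of GB of GPU memory.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "Neura-Tech-AI/DeepSeek-R1-Distill-Qwen-32B"  # repo id from this page

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,  # reduced precision to fit memory
    device_map="auto",           # shard across available GPUs
)

# Reasoning-distilled models are usually driven through the chat template.
messages = [{"role": "user", "content": "What is 17 * 23? Think step by step."}]
inputs = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

outputs = model.generate(inputs, max_new_tokens=512)
print(tokenizer.decode(outputs[0][inputs.shape[-1]:], skip_special_tokens=True))
```

Because the card lists a 32k context length, long multi-step reasoning traces fit in a single generation, which is the main use case these R1 distillations target.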
