Neura-Tech-AI/DeepSeek-R1-Distill-Qwen-14B
TEXT GENERATION
Concurrency Cost: 1
Model Size: 14.8B
Quant: FP8
Ctx Length: 32k
Published: Mar 29, 2026
License: MIT
Architecture: Transformer
Open Weights · Cold

DeepSeek-R1-Distill-Qwen-14B is a 14.8 billion parameter distilled language model developed by DeepSeek AI, based on Qwen2.5-14B. It is fine-tuned on reasoning data generated by the larger DeepSeek-R1 model, which was trained via large-scale reinforcement learning. The model excels at reasoning, math, and coding tasks, with strong performance on benchmarks such as AIME 2024 and MATH-500, and is designed to bring advanced reasoning capabilities to a smaller, more efficient model size.
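Like other R1-style reasoning models, this distilled model typically emits its chain of thought between `<think>` and `</think>` tags before the final answer. Below is a minimal post-processing sketch (the tag format matches DeepSeek-R1's published output convention; the function name is our own) for separating the reasoning trace from the answer:

```python
import re

def extract_final_answer(completion: str) -> str:
    """Strip the <think>...</think> reasoning block that R1-style
    models prepend, returning only the final answer text."""
    return re.sub(r"<think>.*?</think>", "", completion, flags=re.DOTALL).strip()

# Example completion as an R1-distilled model might return it:
reply = "<think>Adding 2 and 2 gives 4.</think>The answer is 4."
print(extract_final_answer(reply))  # -> The answer is 4.
```

Keeping the reasoning block available (rather than discarding it at generation time) is useful for debugging and for evaluating where the model's chain of thought goes wrong.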