DeepSeek-R1-Distill-Qwen-1.5B is a 1.5-billion-parameter language model from DeepSeek-AI, built on the Qwen2.5-Math-1.5B architecture and distilled from the larger DeepSeek-R1 model. It is fine-tuned on reasoning data generated by DeepSeek-R1 and performs well on mathematical, coding, and general reasoning tasks. The model demonstrates that complex reasoning patterns can be transferred effectively to smaller models, making it suitable for applications that need strong reasoning capabilities in a small footprint.