unsloth/DeepSeek-R1-Distill-Qwen-32B

Parameters: 32B
Quantization: FP8
Context length: 32,768 tokens
Released: Jan 20, 2025
License: apache-2.0

DeepSeek-R1-Distill-Qwen-32B Overview

DeepSeek-R1-Distill-Qwen-32B is a 32-billion-parameter model from DeepSeek AI, distilled from the larger DeepSeek-R1 reasoning model and built on the Qwen2.5-32B base. It is part of a series designed to transfer the advanced reasoning patterns of large-scale models into smaller, more efficient architectures. DeepSeek-R1 itself was developed with a reinforcement learning (RL) approach applied initially without supervised fine-tuning (SFT), which fostered emergent reasoning behaviors such as self-verification and long chain-of-thought generation.
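
For orientation, below is a minimal loading sketch using the Hugging Face transformers library. The repository id is taken from this card's title; the bfloat16 dtype, `device_map="auto"` placement, and sampling settings are illustrative assumptions (the FP8 figure above describes the hosted deployment, not these weights), so adjust them to your hardware and serving stack.

```python
# Minimal loading sketch (assumptions: the unsloth/DeepSeek-R1-Distill-Qwen-32B
# repository id from this card, bfloat16 weights, and a GPU setup with enough
# memory for a 32B model; adjust dtype and device placement for your hardware).
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "unsloth/DeepSeek-R1-Distill-Qwen-32B"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,  # assumption; the FP8 above refers to the hosted deployment
    device_map="auto",           # requires the accelerate package
)

# Build a single-turn chat prompt; instructions go in the user turn.
messages = [{"role": "user", "content": "Explain why the sum of two even integers is always even."}]
input_ids = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

# Prompt plus completion must fit within the 32,768-token context window.
output = model.generate(
    input_ids,
    max_new_tokens=1024,
    do_sample=True,
    temperature=0.6,
    top_p=0.95,
)
print(tokenizer.decode(output[0][input_ids.shape[-1]:], skip_special_tokens=True))
```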

Key Capabilities

  • Advanced Reasoning: Excels in complex reasoning tasks, inheriting capabilities from the DeepSeek-R1 parent model.
  • Mathematical Proficiency: Achieves high scores on benchmarks like AIME 2024 (72.6% pass@1) and MATH-500 (94.3% pass@1).
  • Code Generation: Demonstrates strong performance in coding challenges, with a LiveCodeBench pass@1 of 57.2% and a CodeForces rating of 1691.
  • Distilled Performance: Outperforms OpenAI-o1-mini on several benchmarks, showcasing the effectiveness of the distillation process.
  • Extended Context: Supports a context length of 32,768 tokens.

When to Use This Model

This model is well suited to applications that require robust reasoning, mathematical problem-solving, and code generation. Because it is distilled from DeepSeek-R1, it delivers much of the reasoning performance of a far larger model while keeping the deployment efficiency of a 32B-parameter architecture, making it a strong choice for tasks demanding high accuracy in complex cognitive domains.
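
As a hedged illustration of how the model might be prompted for these reasoning workloads, the sketch below follows the usage guidance published for the parent DeepSeek-R1 models: instructions go in the user turn rather than a system prompt, math answers are requested inside \boxed{}, and sampling sits around temperature 0.6 with top_p 0.95. Treat these conventions as assumptions carried over from the parent card rather than requirements of this checkpoint.

```python
# Hedged prompting sketch for math-style reasoning tasks, following the usage
# guidance published with the parent DeepSeek-R1 models (assumption: that
# guidance carries over unchanged to this distilled checkpoint).
messages = [
    {
        "role": "user",
        "content": (
            "What is the sum of the first 100 positive integers? "
            "Please reason step by step, and put your final answer within \\boxed{}."
        ),
    }
]

# Sampling settings suggested for the R1 distills: temperature in the 0.5-0.7
# range (0.6 here) with top_p 0.95, which curbs repetition without flattening
# the chain of thought. max_new_tokens is generous because the model emits its
# reasoning before the boxed answer.
generation_kwargs = {
    "max_new_tokens": 4096,
    "do_sample": True,
    "temperature": 0.6,
    "top_p": 0.95,
}

# With the `model` and `tokenizer` from the loading sketch above:
# input_ids = tokenizer.apply_chat_template(
#     messages, add_generation_prompt=True, return_tensors="pt"
# ).to(model.device)
# output = model.generate(input_ids, **generation_kwargs)
# print(tokenizer.decode(output[0][input_ids.shape[-1]:], skip_special_tokens=True))
```

Asking for the answer inside \boxed{} also makes the final result easy to extract programmatically from the generated reasoning trace.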