unsloth/DeepSeek-R1-Distill-Qwen-32B
TEXT GENERATIONConcurrency Cost:2Model Size:32BQuant:FP8Ctx Length:32kPublished:Jan 20, 2025License:apache-2.0Architecture:Transformer0.0K Open Weights Cold

The DeepSeek-R1-Distill-Qwen-32B model, developed by DeepSeek AI, is a 32 billion parameter language model distilled from the larger DeepSeek-R1 reasoning model and based on the Qwen2.5 architecture. It is specifically optimized for complex reasoning, mathematical, and coding tasks, demonstrating strong performance across various benchmarks. This model leverages advanced distillation techniques to transfer the reasoning capabilities of a larger model into a more compact form, making it suitable for applications requiring high-level cognitive abilities.

Loading preview...

Popular Sampler Settings

Top 3 parameter combinations used by Featherless users for this model. Click a tab to see each config.

temperature
top_p
top_k
frequency_penalty
presence_penalty
repetition_penalty
min_p