deepseek-ai/DeepSeek-R1-0528-Qwen3-8B
Text Generation · Model Size: 8B · Quant: FP8 · Context Length: 32k · Published: May 29, 2025 · License: MIT · Architecture: Transformer · Concurrency Cost: 1 · Open Weights · Status: Warm

DeepSeek-R1-0528-Qwen3-8B is an 8 billion parameter language model developed by DeepSeek-AI, distilled from the DeepSeek-R1-0528 model's chain-of-thought and built upon the Qwen3 8B architecture. This model is specifically optimized for enhanced reasoning and inference capabilities, demonstrating significant improvements in mathematics, programming, and general logic tasks. It achieves state-of-the-art performance among open-source models on benchmarks like AIME 2024, making it suitable for complex problem-solving applications.


Popular Sampler Settings

The three parameter combinations most used by Featherless users for this model are built from the following sampler settings:

- temperature — scales the sharpness of the next-token probability distribution
- top_p — nucleus sampling: keep only the smallest token set whose cumulative probability exceeds p
- top_k — restrict sampling to the k most likely tokens
- frequency_penalty — penalizes tokens in proportion to how often they have already appeared
- presence_penalty — penalizes tokens that have appeared at all, encouraging new topics
- repetition_penalty — multiplicative penalty applied to repeated tokens
- min_p — discards tokens whose probability falls below a fraction of the top token's probability
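As a sketch, these sampler settings map directly onto the fields of an OpenAI-compatible chat-completions request. The helper below assembles such a payload; the parameter values are illustrative assumptions (not one of the published Featherless configs), and the endpoint call at the end is shown only as a comment.

```python
# Hypothetical sketch: carrying common sampler settings in a request to an
# OpenAI-compatible chat-completions endpoint. Values are illustrative, not
# a recommended configuration for this model.

def build_sampler_payload(prompt: str) -> dict:
    """Assemble a chat-completion request body with explicit sampler settings."""
    return {
        "model": "deepseek-ai/DeepSeek-R1-0528-Qwen3-8B",
        "messages": [{"role": "user", "content": prompt}],
        # Illustrative sampler values; tune per task.
        "temperature": 0.6,        # sharpness of the next-token distribution
        "top_p": 0.95,             # nucleus-sampling cumulative-probability cutoff
        "top_k": 40,               # sample only from the 40 most likely tokens
        "frequency_penalty": 0.0,  # penalize tokens by prior frequency
        "presence_penalty": 0.0,   # penalize tokens that appeared at all
        "repetition_penalty": 1.1, # multiplicative repeat penalty (extension field)
        "min_p": 0.05,             # drop tokens below 5% of the top probability
    }

payload = build_sampler_payload("Prove that the square root of 2 is irrational.")
# e.g. requests.post(f"{BASE_URL}/v1/chat/completions", json=payload, headers=auth)
```

Note that `repetition_penalty` and `min_p` are extensions beyond the core OpenAI parameter set; whether a given server accepts them depends on its backend.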