deepseek-ai/DeepSeek-R1-0528-Qwen3-8B

  • Visibility: Public
  • Parameters: 8B
  • Precision: FP8
  • Context length: 32768 tokens
  • License: MIT
  • Source: Hugging Face
Overview

DeepSeek-R1-0528-Qwen3-8B: Enhanced Reasoning in a Compact Model

DeepSeek-R1-0528-Qwen3-8B is an 8 billion parameter model from DeepSeek-AI, representing a significant advancement in reasoning capabilities for smaller-scale language models. It is produced by distilling the chain-of-thought of the larger DeepSeek-R1-0528 into the Qwen3 8B base model, transferring much of the larger model's reasoning ability into a compact architecture. The model focuses on depth of reasoning and inference, benefiting from the algorithmic optimizations and increased post-training compute that went into DeepSeek-R1-0528.
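
DeepSeek-R1-style models emit their chain-of-thought inside `<think>…</think>` tags before the final answer. A minimal sketch of separating the reasoning trace from the user-facing answer (the tag format follows DeepSeek's R1 releases; the helper name is illustrative):

```python
import re

def split_reasoning(text: str) -> tuple[str, str]:
    """Split a DeepSeek-R1-style completion into (reasoning, answer).

    The model wraps its chain-of-thought in <think>...</think>;
    everything after the closing tag is the final answer.
    """
    match = re.search(r"<think>(.*?)</think>", text, flags=re.DOTALL)
    if match is None:
        # No reasoning block emitted; treat the whole text as the answer.
        return "", text.strip()
    reasoning = match.group(1).strip()
    answer = text[match.end():].strip()
    return reasoning, answer

completion = "<think>2 + 2 equals 4.</think>The answer is 4."
reasoning, answer = split_reasoning(completion)
```

This keeps the reasoning available for inspection or logging while only the answer is surfaced to the end user.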

Key Capabilities & Performance:

  • Superior Reasoning: Demonstrates outstanding performance in complex reasoning tasks across mathematics, programming, and general logic. For instance, it achieves 86.0% on AIME 2024, surpassing Qwen3 8B by 10.0% and matching Qwen3-235B-thinking.
  • Reduced Hallucination: Offers a lower hallucination rate compared to previous versions, leading to more reliable outputs.
  • Enhanced Function Calling: Provides improved support for function calling, making it more versatile for tool-use applications.
  • Code & Math Proficiency: Shows strong performance in coding benchmarks like LiveCodeBench (60.5%) and various math competitions (e.g., 76.3% on AIME 2025).
  • Qwen3 Compatibility: Shares the same model architecture as Qwen3-8B but utilizes the DeepSeek-R1-0528 tokenizer configuration.

Good for:

  • Academic Research: Particularly valuable for research into reasoning models and chain-of-thought distillation.
  • Industrial Development: Ideal for integrating advanced reasoning capabilities into small-scale applications where efficiency and performance are critical.
  • Complex Problem Solving: Excels in scenarios requiring deep logical inference, such as mathematical problem-solving and code generation.
  • Applications Requiring Function Calling: Its enhanced function-calling support makes it suitable for agentic workflows.
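
For agentic use, a function call from the model typically arrives as JSON naming a tool and its arguments. A minimal sketch of dispatching such a call to locally registered Python functions (the JSON shape and the tool names here are illustrative assumptions, not a fixed DeepSeek schema):

```python
import json

# Illustrative local tools; a real deployment registers its own.
def add(a: float, b: float) -> float:
    return a + b

def uppercase(text: str) -> str:
    return text.upper()

TOOLS = {"add": add, "uppercase": uppercase}

def dispatch_tool_call(raw: str):
    """Parse a JSON tool call like {"name": ..., "arguments": {...}}
    and invoke the matching registered function with its arguments."""
    call = json.loads(raw)
    func = TOOLS.get(call["name"])
    if func is None:
        raise ValueError(f"unknown tool: {call['name']}")
    return func(**call["arguments"])

result = dispatch_tool_call('{"name": "add", "arguments": {"a": 2, "b": 3}}')
```

In a full loop, the dispatch result would be fed back to the model as a tool response so it can continue reasoning with the returned value.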