deepseek-ai/DeepSeek-R1-Distill-Qwen-7B
DeepSeek-R1-Distill-Qwen-7B is a 7.6 billion parameter language model from DeepSeek AI, distilled from the larger DeepSeek-R1 model and built on the Qwen2.5-Math-7B base. Fine-tuned on reasoning data generated by DeepSeek-R1, it excels at mathematical, coding, and general reasoning tasks, and its strong performance on complex problem-solving makes it well suited to applications that require robust analytical capabilities.
DeepSeek-R1-Distill-Qwen-7B: Reasoning Capabilities in a Compact Model
DeepSeek-R1-Distill-Qwen-7B is a 7.6 billion parameter model from DeepSeek AI, part of the DeepSeek-R1 series. It is a distilled version of the larger DeepSeek-R1, fine-tuned on reasoning patterns generated by its predecessor, and built upon the Qwen2.5-Math-7B base model. This approach demonstrates that complex reasoning capabilities can be effectively transferred to smaller, dense models.
Key Capabilities
- Enhanced Reasoning: Benefits from distillation of advanced reasoning patterns, showing strong performance in math, code, and general reasoning benchmarks.
- Long Context Understanding: Supports a substantial context length of 131,072 tokens, enabling processing of extensive inputs.
- Performance: Achieves competitive results across various benchmarks, including AIME 2024 (55.5% pass@1), MATH-500 (92.8% pass@1), and LiveCodeBench (37.6% pass@1).
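The pass@1 scores above are typically reported using the standard unbiased pass@k estimator (compute n samples per problem, count the c correct ones, then estimate the chance that at least one of k draws succeeds). As a minimal sketch of that metric, with the sample counts below chosen purely for illustration:

```python
from math import comb

def pass_at_k(n: int, c: int, k: int) -> float:
    """Unbiased pass@k estimator: probability that at least one of k
    samples, drawn without replacement from n generations of which c
    are correct, solves the problem."""
    if n - c < k:
        # Fewer incorrect samples than draws: success is guaranteed.
        return 1.0
    return 1.0 - comb(n - c, k) / comb(n, k)

# For k=1 the estimator reduces to the fraction of correct samples,
# e.g. a problem solved in 9 of 16 illustrative generations:
print(pass_at_k(16, 9, 1))  # 0.5625
```

For k=1 this is simply c/n averaged over problems, which is why pass@1 is often described as per-sample accuracy.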
Good For
- Complex Problem Solving: Ideal for tasks requiring step-by-step reasoning, such as mathematical proofs, code generation, and logical deduction.
- Research and Development: Provides a powerful, open-source foundation for further research into model distillation and reasoning capabilities.
- Applications with Long Contexts: Suitable for use cases where processing and understanding very long documents or conversations are critical.