DeepSeek-R1-Distill-Qwen-1.5B Overview
DeepSeek-R1-Distill-Qwen-1.5B is a 1.5-billion-parameter language model developed by DeepSeek-AI. It is one of a series of distilled models derived from the larger DeepSeek-R1, which was trained with a novel large-scale reinforcement learning (RL) approach that incentivizes reasoning capabilities without an initial supervised fine-tuning (SFT) stage.
Key Capabilities & Features
- Reasoning Distillation: This model benefits from reasoning patterns distilled from the powerful DeepSeek-R1, which itself demonstrated advanced reasoning behaviors like self-verification and reflection.
- Optimized for Math & Reasoning: Fine-tuned on reasoning data, it shows strong performance on mathematical benchmarks such as AIME 2024 (28.9% pass@1) and MATH-500 (83.9% pass@1).
- Efficient Size: At 1.5 billion parameters, it is a compact option for deploying reasoning-focused applications where larger models are impractical.
- Qwen2.5 Base: Built on the Qwen2.5-Math-1.5B base model, leveraging its foundational math capabilities.
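The benchmark numbers above are reported as pass@1. As a minimal sketch of what that metric means, the snippet below implements the standard unbiased pass@k estimator (the sample counts are hypothetical, not from the model's evaluation): given n generated samples per problem of which c are correct, pass@1 reduces to the fraction of correct samples.

```python
from math import comb

def pass_at_k(n: int, c: int, k: int) -> float:
    """Unbiased pass@k estimator: probability that at least one of k
    samples drawn from n generations is correct, given c correct ones."""
    if n - c < k:
        return 1.0  # too few incorrect samples to fill k draws
    return 1.0 - comb(n - c, k) / comb(n, k)

# Hypothetical example: 16 samples for one problem, 5 of them correct.
# For k=1 the estimator simplifies to c/n.
print(pass_at_k(16, 5, 1))  # → 0.3125
```

A benchmark score like 28.9% pass@1 is then the mean of this per-problem value across the benchmark's problem set.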
When to Use This Model
- Resource-constrained environments: Ideal for applications that need strong reasoning in a small memory and compute footprint.
- Mathematical problem-solving: Particularly suited for tasks involving complex mathematical reasoning.
- Research into reasoning distillation: Useful for exploring how advanced reasoning can be transferred to smaller models.