Name: open-r1/OpenR1-Distill-7B API
Brand: Featherless.ai
Author: open-r1

OpenR1-Distill-7B: Reasoning Capabilities from DeepSeek-R1

OpenR1-Distill-7B is a 7.6-billion-parameter language model developed by open-r1 and post-trained from a modified version of Qwen/Qwen2.5-Math-7B. Its distinguishing feature is its training data: the Mixture-of-Thoughts dataset, a curated collection of 350,000 verified reasoning traces distilled from the larger DeepSeek-R1 model.

Key Capabilities & Features

Enhanced Reasoning: Specifically trained to perform step-by-step reasoning across diverse domains including mathematics, coding, and science.
DeepSeek-R1 Replication: Aims to reproduce the reasoning performance of DeepSeek-R1 in an open, reproducible 7B-parameter model.
Extended Context: Built on a Qwen2.5-Math-7B variant whose RoPE base frequency was raised to 300k so that the model could be trained on contexts of 32k tokens.
Performance Benchmarks: Achieves competitive scores on reasoning benchmarks such as AIME 2024 (52.7), MATH-500 (89.0), GPQA Diamond (52.8), and LiveCodeBench v5 (39.4), closely matching or exceeding DeepSeek-R1-Distill-Qwen-7B.
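
To illustrate the Extended Context point above: in rotary position embeddings (RoPE), each pair of dimensions rotates at a frequency derived from the base, and a larger base slows those rotations so that positional angles wrap around less often over a long sequence. The sketch below uses the standard RoPE frequency formula to compare a base of 10,000 with the 300k base mentioned above. The head dimension of 128 and the comparison base of 10,000 are illustrative assumptions, not values stated in this listing, and the function names are hypothetical.

```python
import math


def rope_inverse_frequencies(head_dim: int, base: float) -> list[float]:
    """Standard RoPE frequencies: base ** (-2i / head_dim) for each dimension pair i."""
    return [base ** (-2 * i / head_dim) for i in range(head_dim // 2)]


def longest_wavelength(head_dim: int, base: float) -> float:
    """Number of token positions the slowest-rotating pair needs for one full turn."""
    slowest = rope_inverse_frequencies(head_dim, base)[-1]
    return 2 * math.pi / slowest


HEAD_DIM = 128            # assumption for illustration only
COMMON_BASE = 10_000.0    # assumption: a widely used RoPE base, for comparison
EXTENDED_BASE = 300_000.0 # the 300k base described in the listing
CONTEXT_TOKENS = 32_768   # "32k tokens" from the listing

if __name__ == "__main__":
    for base in (COMMON_BASE, EXTENDED_BASE):
        wavelength = longest_wavelength(HEAD_DIM, base)
        print(f"base={base:>9.0f}  longest wavelength = {wavelength:,.0f} tokens")
```

Running the script prints the longest wavelength for each base. Under these assumptions the 300k base gives slower rotations in every pair except the first, which is one reason a larger base is commonly paired with longer training contexts.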

Ideal Use Cases

Research on Inference-Time Compute: Provides a strong baseline for exploring efficient reasoning at inference time.
Reinforcement Learning with Verifiable Rewards (RLVR): Suitable for developing and testing RL methods whose rewards can be checked automatically, for example by comparing a final answer against a known solution.
Mathematical and Scientific Problem Solving: Excels in tasks requiring logical deduction and multi-step solutions.
Code Generation and Analysis: Demonstrates proficiency in coding-related reasoning challenges.
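
As a rough illustration of the RLVR use case above, a verifiable reward can be as simple as checking a model's final answer against a reference. The sketch below is a hypothetical example, not the reward used by open-r1: it assumes the completion marks its final answer with \boxed{...} (a common convention for math outputs), handles only answers without nested braces, and compares strings after removing spaces.

```python
import re
from typing import Optional

# Matches \boxed{...} where the content has no nested braces (a simplifying assumption).
BOXED_PATTERN = re.compile(r"\\boxed\{([^{}]*)\}")


def extract_final_answer(completion: str) -> Optional[str]:
    """Return the content of the last \\boxed{...} in the completion, if any."""
    matches = BOXED_PATTERN.findall(completion)
    return matches[-1].strip() if matches else None


def verifiable_reward(completion: str, reference: str) -> float:
    """Binary reward: 1.0 if the final boxed answer equals the reference, else 0.0."""
    answer = extract_final_answer(completion)
    if answer is None:
        return 0.0  # no checkable answer was produced
    normalize = lambda s: s.replace(" ", "")
    return 1.0 if normalize(answer) == normalize(reference) else 0.0


if __name__ == "__main__":
    sample = "First try \\boxed{40}. Rechecking the sum gives \\boxed{42}."
    print(verifiable_reward(sample, "42"))
```

A practical checker would usually need more robust answer parsing (fractions, units, equivalent expressions) or, for code tasks, execution against unit tests.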