RoyceLu/OpenR1-Distill-0.6B is a 0.8-billion-parameter model based on Qwen3-0.6B-Base, fine-tuned with the Open-R1 supervised distillation recipe on the Mixture-of-Thoughts dataset. The model is optimized for reasoning-oriented text generation, showing improved performance over its base model on benchmarks such as AIME 2024 and GPQA Diamond. It supports a 32,768-token context length and is intended for applications requiring robust reasoning capabilities.
OpenR1-Distill-0.6B: Reasoning-Oriented Distilled Model
OpenR1-Distill-0.6B is a 0.8-billion-parameter language model developed by RoyceLu, based on the Qwen/Qwen3-0.6B-Base architecture. It was fine-tuned with the Open-R1 supervised distillation recipe on the open-r1/Mixture-of-Thoughts dataset, training on 349,317 samples over 5 epochs.
Key Capabilities & Features
- Reasoning Enhancement: Specifically fine-tuned for reasoning-oriented text generation, aiming to improve logical processing and problem-solving.
- Extended Context Window: Supports a maximum sequence length of 32,768 tokens, allowing it to process longer inputs and generate more extensive outputs.
- Distillation Approach: Uses a supervised distillation recipe, transferring knowledge from a larger teacher model (implied by the Open-R1 recipe) to a smaller, more efficient 0.8B-parameter model.
- Benchmark Improvements: Shows gains over its base model on benchmarks such as AIME 2024 (up to +0.73 pp) and GPQA Diamond (up to +1.89 pp) at specific `max_new_tokens` settings, indicating enhanced reasoning and general-knowledge capabilities.
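The card does not include a usage snippet; a minimal sketch using the standard Hugging Face `transformers` API might look like the following. The model id comes from this card, while the prompt and the `max_new_tokens` value are illustrative (the benchmark deltas above were measured at specific `max_new_tokens` settings, not necessarily this one).

```python
MODEL_ID = "RoyceLu/OpenR1-Distill-0.6B"  # model id from this card

def generate(prompt: str, max_new_tokens: int = 512) -> str:
    """Sketch: generate a completion with the standard transformers API.

    max_new_tokens is illustrative; the benchmark gains reported on this
    card depend on the max_new_tokens budget used at evaluation time.
    """
    # Imported lazily so the function can be defined without transformers installed.
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
    model = AutoModelForCausalLM.from_pretrained(MODEL_ID)

    inputs = tokenizer(prompt, return_tensors="pt")
    outputs = model.generate(**inputs, max_new_tokens=max_new_tokens)

    # Strip the prompt tokens so only the new completion is returned.
    completion = outputs[0][inputs["input_ids"].shape[-1]:]
    return tokenizer.decode(completion, skip_special_tokens=True)

# Example (downloads the model weights on first call):
# print(generate("If x + 3 = 7, what is x? Think step by step."))
```

Larger `max_new_tokens` budgets give the model more room for step-by-step reasoning, which is why evaluation results are typically reported per setting.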
Good For
- Reasoning Tasks: Ideal for applications requiring logical deduction, problem-solving, and complex question answering.
- Resource-Constrained Environments: As a 0.8B parameter model, it offers a more efficient solution for reasoning tasks compared to larger models, while maintaining a substantial context window.
- Text Generation: Suited for generating coherent and contextually relevant text, particularly in scenarios where reasoning is a primary requirement.