WhirlwindAI/Qwen-R1-0.5B
WhirlwindAI/Qwen-R1-0.5B is a 0.5 billion parameter language model based on Qwen2.5-0.5B-Instruct, fine-tuned by WhirlwindAI. This model is specifically designed to perform explicit chain-of-thought reasoning by generating a thinking process before providing an answer. It excels at transparent and debuggable responses, making it suitable for applications requiring clear, step-by-step reasoning from a compact model.
Loading preview...
Overview
WhirlwindAI/Qwen-R1-0.5B is a compact 0.5 billion parameter language model, fine-tuned from Qwen2.5-0.5B-Instruct by WhirlwindAI. Its core innovation lies in its ability to reason before it answers, explicitly generating a thought process using <thinking> tags. This approach aims to make the model's responses more transparent, reliable, and easier to debug by separating the reasoning steps from the final answer.
Key Capabilities
- Explicit Chain-of-Thought: The model is trained to structure its output with a
<thinking>{reasoning}</thinking>{answer}format, ensuring a clear, step-by-step thought process. - Enhanced Transparency: By showing its reasoning, the model provides insights into how it arrived at a particular answer, improving user trust and debuggability.
- Consistent Formatting: Demonstrates excellent consistency in adhering to the
<thinking>tag format. - General Knowledge Retention: Successfully retains general knowledge from its base model.
Training Details
The model was fine-tuned using QLoRA (4-bit) on the WhirlwindAI/Soft-CoT-1K dataset, comprising 1,355 examples over 3 epochs. This targeted training focused on instilling the 'reason first, answer second' behavior.
Performance & Limitations
Evaluations show excellent performance in maintaining the required thinking tag format and good performance in general knowledge and creative reasoning. However, the model currently needs improvement in multi-step math, logic, physics, and science reasoning, and may sometimes hallucinate facts. It is best suited for tasks where explicit, transparent reasoning is prioritized over complex mathematical or scientific accuracy in a small model footprint.