iAmBoosted/Qwen3.5-9B-OSS-Distilled
iAmBoosted/Qwen3.5-9B-OSS-Distilled is a 9 billion parameter language model distilled from Qwen/Qwen3.5-9B and fine-tuned specifically to improve reasoning termination. It addresses the base model's tendency to spiral on complex prompts by adopting the concise reasoning style of openai/gpt-oss-20b. The model reliably completes math, science, code, and logic prompts, making it suitable for applications that need consistent, terminating reasoning. It supports a 32768-token context length and reduces the 'no-answer' rate on hard prompts from 36.2% to 0.5%.
Overview
iAmBoosted/Qwen3.5-9B-OSS-Distilled is a 9 billion parameter model derived from Qwen/Qwen3.5-9B, with a primary focus on improving reasoning termination. The base Qwen3.5-9B often 'spirals' on challenging prompts and fails to produce an answer. This distilled version was trained with LoRA-based supervised fine-tuning on reasoning traces from openai/gpt-oss-20b so that it adopts a tight, terminating reasoning style.
Key Improvements & Capabilities
- Reduced 'No-Answer' Rate: The model drastically cuts the 'no-answer' rate on hard prompts from 36.2% to 0.5%, ensuring reliable output.
- Improved Reasoning Style: A blind A/B judge preferred this model's answers over the stock Qwen3.5-9B in 60.3% of comparisons (ties excluded), which suggests a cleaner and more effective reasoning process.
- Text-Only Focus: While the base Qwen3.5-9B is a vision-language model, this distillation focused exclusively on text-only reasoning, and its multimodal capabilities remain untested post-fine-tuning.
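The two headline figures above are simple ratios. The following is a minimal sketch of how such numbers could be computed, not the evaluation code behind the reported results. The function names, the boolean `answered` flags, and the verdict labels (`"distilled"`, `"base"`, `"tie"`) are all hypothetical.

```python
from collections import Counter
from typing import Sequence


def no_answer_rate(answered: Sequence[bool]) -> float:
    """Fraction of prompts for which no final answer was produced.

    `answered` holds one flag per prompt (True = an answer was produced).
    How a response is judged "answered" is an assumption left to the caller.
    """
    if not answered:
        raise ValueError("answered must be non-empty")
    return sum(1 for ok in answered if not ok) / len(answered)


def preference_excluding_ties(
    verdicts: Sequence[str], candidate: str = "distilled", tie: str = "tie"
) -> float:
    """Share of decided (non-tie) comparisons won by `candidate`.

    The label strings are hypothetical placeholders for a judge's output.
    """
    counts = Counter(verdicts)
    decided = sum(n for label, n in counts.items() if label != tie)
    if decided == 0:
        raise ValueError("no decided comparisons")
    return counts[candidate] / decided


if __name__ == "__main__":
    # Toy data, not the model's evaluation set.
    print(no_answer_rate([True, False, True, True]))
    print(preference_excluding_ties(["distilled", "base", "tie", "distilled"]))
```

On the reported figures, moving from 36.2% to 0.5% is an absolute reduction of 35.7 percentage points. Excluding ties means the 60.3% is a share of decided comparisons only, not of all comparisons.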
Intended Use Cases
- Reliable Reasoning: Ideal for applications requiring consistent and terminating reasoning in domains like math, science, code, and logic.
- Behavioral Fix: This model provides a behavioral fix for the base Qwen3.5-9B's tendency to get stuck in reasoning loops.
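An application that depends on terminating reasoning may still want to check that a response actually reached a final answer. The sketch below shows one way to do that under stated assumptions: that the reasoning is closed by a `</think>` tag (not confirmed by this model card), and that `generate` is a hypothetical caller-supplied function wrapping whatever inference stack is in use. The token budgets are illustrative.

```python
from typing import Callable, Optional, Sequence, Tuple

# Assumed reasoning delimiter; verify against the model's chat template.
CLOSE_TAG = "</think>"


def split_reasoning(text: str, close_tag: str = CLOSE_TAG) -> Tuple[str, Optional[str]]:
    """Split a response into (reasoning, answer).

    Returns answer=None when the closing tag never appears or nothing
    follows it, i.e. the response did not terminate with an answer.
    """
    head, sep, tail = text.rpartition(close_tag)
    if not sep:
        return text, None  # reasoning was never closed
    answer = tail.strip()
    return head, (answer if answer else None)


def answer_within_budget(
    generate: Callable[[str, int], str],
    prompt: str,
    budgets: Sequence[int] = (2048, 8192),
) -> Optional[str]:
    """Try increasing token budgets until a final answer appears.

    `generate(prompt, max_new_tokens)` is a hypothetical callable supplied
    by the caller; it is not part of any specific library.
    """
    for max_new_tokens in budgets:
        _, answer = split_reasoning(generate(prompt, max_new_tokens))
        if answer is not None:
            return answer
    return None  # caller decides how to handle a non-terminating response
```

Returning `None` rather than raising keeps the fallback policy (retry, escalate, or report failure) in the caller's hands.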
Limitations
- Style, Not Capability Upgrade: This fine-tune improves reasoning style and termination, but does not add new knowledge or increase raw problem-solving capability beyond the base model.
- Untested Multimodal: Any multimodal functionality of the base model is untested and should not be relied upon in this distilled version.
- Puzzle Domain: The tighter reasoning style may not be optimal for exploratory puzzle prompts, where the baseline model was sometimes preferred.