Qwen3.5-2B-Claude-4.6-Opus-Reasoning-Distilled Overview

This is a 2.3-billion-parameter language model built on the Qwen3.5-2B base and fine-tuned for reasoning. It uses Chain-of-Thought (CoT) distillation, sourced primarily from Claude-4.6 Opus interactions, to encourage structured, step-by-step problem-solving.

Key Enhancements & Capabilities

Reasoning Distillation: Also trained on high-quality reasoning trajectories distilled from Qwen3.5-27B, improving performance in science, instruction-following, and mathematics.
Structured Thinking: Adopts a streamlined, structured thinking pattern (e.g., "Let me analyze this request carefully: 1..2..3...") that reduces redundant reasoning loops.
SFT with Unsloth: Trained with Supervised Fine-Tuning (SFT), using Unsloth to reduce memory and compute costs. Training focuses on having the model generate an internal <think> sequence before producing its final answer.
Extended Context: Supports a context window of 16,384 tokens, leaving room for complex multi-step reasoning within that limit.
Dataset Integration: Trained on curated datasets such as nohurry/Opus-4.6-Reasoning-3000x-filtered and Jackrong/Qwen3.5-reasoning-700x to strengthen structured problem-solving and reasoning diversity.
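
Because the model is trained to emit a <think> sequence before its final answer, downstream code usually needs to separate the two. The following is a minimal sketch, not part of the model's own tooling. It assumes the reasoning ends with a closing </think> tag (this overview only names the opening <think> tag) and that the prompt length in tokens is already known. The names split_reasoning and remaining_token_budget are hypothetical helpers introduced for illustration.

```python
from typing import Tuple

# Context window stated in this overview.
CONTEXT_WINDOW = 16_384


def split_reasoning(text: str) -> Tuple[str, str]:
    """Split raw model output into (reasoning, answer).

    Assumption: the reasoning block is terminated by "</think>".
    If no closing tag is present, the whole text is treated as the answer.
    """
    head, sep, tail = text.partition("</think>")
    if not sep:
        return "", text.strip()
    # Drop the opening tag if the output includes it; some chat templates
    # place it in the prompt instead, so it may be absent here.
    reasoning = head.replace("<think>", "", 1).strip()
    return reasoning, tail.strip()


def remaining_token_budget(prompt_tokens: int,
                           context_window: int = CONTEXT_WINDOW) -> int:
    """Tokens left for reasoning plus answer after the prompt is counted."""
    return max(context_window - prompt_tokens, 0)


if __name__ == "__main__":
    sample = (
        "<think>Let me analyze this request carefully: "
        "1. 2 + 2 is 4.</think>\nThe answer is 4."
    )
    reasoning, answer = split_reasoning(sample)
    print("reasoning:", reasoning)
    print("answer:", answer)
    print("budget after a 16,000-token prompt:", remaining_token_budget(16_000))
```

The budget helper matters because the <think> sequence and the final answer share the same 16,384-token window: a long prompt leaves less room for step-by-step reasoning.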

Intended Use Cases

This model is best suited for offline analytical tasks, coding, mathematical problem-solving, and other logic-dependent prompting scenarios where transparent, step-by-step internal logic is beneficial. It is a test version intended for academic research and technical exploration.
