Jackrong/Qwen3.5-4B-Claude-4.6-Opus-Reasoning-Distilled

Tags: Vision, Open Weights
Concurrency Cost: 1
Model Size: 4.5B
Quant: BF16
Ctx Length: 32k
Tool Calling: Supported
Published: Mar 3, 2026
License: apache-2.0
Architecture: Transformer

Jackrong/Qwen3.5-4B-Claude-4.6-Opus-Reasoning-Distilled is a 4.5-billion-parameter model developed by Jackrong and fine-tuned from Qwen3.5-4B. It specializes in structured reasoning and problem-solving, using Chain-of-Thought (CoT) distillation from Claude-4.6 Opus interactions. The model is tuned to break complex problems into explicit steps before answering, which makes it suitable for tasks that require transparent, step-by-step logic.

Overview

Jackrong/Qwen3.5-4B-Claude-4.6-Opus-Reasoning-Distilled is a 4.5-billion-parameter language model developed by Jackrong and built on Qwen3.5-4B. It is fine-tuned for reasoning through Chain-of-Thought (CoT) distillation from high-quality Claude-4.6 Opus interactions. The goal is structured, step-by-step problem-solving: the model is trained to write out its internal reasoning inside <think> tags before it generates the final answer.
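Because the reasoning is emitted inside <think> tags ahead of the answer, downstream code will often want to separate the two. The snippet below is a minimal sketch of that split using only the Python standard library. It assumes a single <think>...</think> block that precedes the answer; the function name split_reasoning and the sample string are hypothetical and not part of the model's tooling.

```python
import re

# Assumption: the output contains at most one <think>...</think> block,
# placed before the final answer, as the overview describes.
THINK_BLOCK = re.compile(r"<think>(.*?)</think>", re.DOTALL)

def split_reasoning(text: str) -> tuple[str, str]:
    """Return (reasoning, answer); reasoning is empty if no block is found."""
    match = THINK_BLOCK.search(text)
    if match is None:
        return "", text.strip()
    reasoning = match.group(1).strip()
    answer = text[match.end():].strip()  # everything after the closing tag
    return reasoning, answer

# Hypothetical model output, used only for illustration.
sample = "<think>\n1. 17 is odd.\n2. It has no divisors up to 4.\n</think>\nYes, 17 is prime."
reasoning, answer = split_reasoning(sample)
print(answer)
```

If generation is cut off before the closing tag, this sketch treats the whole output as the answer; callers that need stricter handling would have to check for an unclosed <think> tag themselves.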

Key Capabilities

  • Structured Reasoning: Employs a streamlined reasoning paradigm, adopting an efficient "Let me analyze this request carefully: 1..2..3..." pattern to reduce redundant cognitive loops.
  • CoT Distillation: Leverages Supervised Fine-Tuning (SFT) with response-only training, masking instructions to focus loss calculation purely on the generation of <think> sequences and subsequent solutions.
  • Enhanced Reasoning Data: Further improved with additional reasoning data distilled from Qwen3.5-27B, including datasets like Jackrong/Qwen3.5-reasoning-700x, to strengthen structured problem-solving and reasoning diversity.
  • Extended Context: Supports a 16,384-token context window, which leaves room for long multi-step reasoning traces.
  • Performance: Shows improved performance over baseline 4B models on benchmarks like GPQA Diamond (38.88%) and AI2 ARC-Challenge (66.38%).
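The response-only training mentioned under CoT Distillation can be illustrated with a short sketch. The idea is that instruction tokens stay in the input but are excluded from the loss, so only the <think> sequence and the solution contribute gradients. The code below is an assumption-laden illustration, not the author's training pipeline: mask_prompt_labels and the token ids are hypothetical, and -100 is used because it is the default ignore_index of PyTorch's CrossEntropyLoss.

```python
IGNORE_INDEX = -100  # default ignore_index of PyTorch's CrossEntropyLoss

def mask_prompt_labels(input_ids, prompt_len, ignore_index=IGNORE_INDEX):
    """Copy input_ids into labels, hiding the first prompt_len tokens from the loss."""
    if not 0 <= prompt_len <= len(input_ids):
        raise ValueError("prompt_len must lie within the sequence")
    # Instruction positions get ignore_index; response positions keep their ids.
    return [ignore_index] * prompt_len + list(input_ids[prompt_len:])

# Hypothetical token ids: 4 instruction tokens followed by 3 response tokens.
input_ids = [101, 7, 8, 9, 42, 43, 2]
labels = mask_prompt_labels(input_ids, prompt_len=4)
print(labels)  # [-100, -100, -100, -100, 42, 43, 2]
```

A loss function that skips positions labeled with the ignore index then scores only the response tokens, which matches the masking behavior the bullet describes.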

Good For

  • Offline analytical tasks requiring transparent, step-by-step logic.
  • Coding and mathematical problem-solving.
  • Logic-heavy prompting where visibility into the model's internal reasoning is important.

Limitations

  • As an autoregressive LLM, it carries a risk of hallucination, especially when verifying real-world events within its thinking sequence.
  • Intended primarily for academic research and technical exploration.