ermiaazarkhalili/Qwen3.5-9B-SFT-Claude-Opus-Reasoning-Unsloth

Vision | Concurrency Cost: 1 | Model Size: 9B | Quant: FP8 | Ctx Length: 32k | Tool Calling: Supported | Published: Apr 25, 2026 | License: apache-2.0 | Architecture: Transformer | Open Weights | Cold

ermiaazarkhalili/Qwen3.5-9B-SFT-Claude-Opus-Reasoning-Unsloth is a 9-billion-parameter Qwen3.5-based language model fine-tuned by ermiaazarkhalili for reasoning distillation and chain-of-thought learning. It was trained on Claude's reasoning traces using the Unsloth framework and targets tasks that require step-by-step problem-solving and logical deduction. It was fine-tuned with a 2,048-token context window and is also available in GGUF formats for CPU and edge deployment.


Model Overview

This model is a 9-billion-parameter Qwen3.5 variant developed by ermiaazarkhalili. It was fine-tuned with the Unsloth framework, which reportedly enabled 2x faster training and 60% lower VRAM consumption. The primary objective of the fine-tuning was to improve the model's reasoning capabilities through distillation.

Key Capabilities & Training Details

  • Reasoning Distillation: Optimized for chain-of-thought learning by training on the claude-reasoning-distillation dataset, which includes 10,477 samples of Claude's reasoning traces with <think> blocks.
  • Efficient Fine-tuning: Utilizes Unsloth and QLoRA (4-bit) for efficient SFT, targeting modules like q_proj, k_proj, v_proj, o_proj, gate_proj, up_proj, and down_proj.
  • Context Length: Fine-tuned with a 2,048-token context window (the serving context length listed above is 32k).
  • GGUF Availability: Quantized GGUF versions (e.g., Q4_K_M, Q5_K_M, Q8_0) are provided for CPU and edge inference, compatible with tools like Ollama and llama.cpp.
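
To give a feel for why adapting only the listed projection modules keeps fine-tuning cheap, here is a minimal sketch that counts the trainable parameters LoRA adds to one transformer block. LoRA attaches two low-rank matrices to each targeted linear layer, so a layer with in_features inputs and out_features outputs gains rank * (in_features + out_features) parameters. The layer dimensions and the rank below are hypothetical placeholders chosen for illustration; this card does not state the actual Qwen3.5-9B shapes or the rank used in training.

```python
# Hypothetical dimensions for illustration only. The real Qwen3.5-9B layer
# shapes and the LoRA rank used for this model are NOT stated in the card.
HIDDEN = 4096   # assumed hidden size
KV_DIM = 1024   # assumed key/value projection width
FFN = 12288     # assumed feed-forward width

# (in_features, out_features) for each targeted projection in one block.
TARGET_MODULES = {
    "q_proj": (HIDDEN, HIDDEN),
    "k_proj": (HIDDEN, KV_DIM),
    "v_proj": (HIDDEN, KV_DIM),
    "o_proj": (HIDDEN, HIDDEN),
    "gate_proj": (HIDDEN, FFN),
    "up_proj": (HIDDEN, FFN),
    "down_proj": (FFN, HIDDEN),
}


def lora_params(in_features, out_features, rank):
    """Parameters LoRA adds to one linear layer.

    LoRA learns A (rank x in_features) and B (out_features x rank),
    so the total is rank * (in_features + out_features).
    """
    return rank * (in_features + out_features)


def lora_params_per_block(rank):
    """Sum the added parameters over every targeted module in one block."""
    return sum(lora_params(i, o, rank) for i, o in TARGET_MODULES.values())


if __name__ == "__main__":
    rank = 16  # assumed rank, not taken from the card
    for name, (i, o) in TARGET_MODULES.items():
        print(f"{name:10s} +{lora_params(i, o, rank):,}")
    print(f"per block  +{lora_params_per_block(rank):,}")
```

Under these assumed shapes, the adapters add on the order of a million parameters per block, a small fraction of the frozen 4-bit base weights they sit on top of.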

Ideal Use Cases

  • Complex Problem Solving: Suited for tasks that benefit from explicit, step-by-step reasoning.
  • Educational Applications: Can be used for generating detailed explanations or solving logical puzzles.
  • Research & Development: A strong base for further experimentation in reasoning-focused AI applications.
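
For use cases like these, it is often handy to separate the model's reasoning trace from its final answer, for example to show only the answer to a learner while logging the reasoning. The sketch below assumes the model emits its reasoning inside <think>...</think> tags, mirroring the format of the training data described above; the helper name split_reasoning is hypothetical and not part of any library.

```python
import re

# Matches one <think>...</think> block, including newlines inside it.
THINK_RE = re.compile(r"<think>(.*?)</think>", re.DOTALL)


def split_reasoning(text):
    """Split a completion into (reasoning, answer).

    All closed <think> blocks are joined into the reasoning string;
    everything outside them is returned as the answer.
    """
    blocks = [block.strip() for block in THINK_RE.findall(text)]
    answer = THINK_RE.sub("", text).strip()
    return "\n\n".join(blocks), answer


if __name__ == "__main__":
    sample = "<think>\n2 + 2 is 4.\n</think>\nThe answer is 4."
    reasoning, answer = split_reasoning(sample)
    print("Reasoning:", reasoning)
    print("Answer:", answer)
```

If generation is cut off before the closing tag, the pattern does not match and the unclosed block stays in the answer, so callers may want to check for a stray <think> before displaying the result.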

Limitations

  • Primarily trained on English data.
  • Knowledge is limited to the base model's training data and its cutoff date.
  • Not extensively safety-tuned; sensitive applications require external guardrails.