trjxter/Qwimi3.5-9B-Kimik2.6-Opus-Distill-MTP-BF16

VISIONConcurrency Cost:1Model Size:9BQuant:FP8Ctx Length:32kTool Calling:SupportedPublished:May 22, 2026License:apache-2.0Architecture:Transformer0.0K Open Weights Cold

trjxter/Qwimi3.5-9B-Kimik2.6-Opus-Distill-MTP-BF16 is a 9 billion parameter merged BF16 causal language model developed by trjxter, fine-tuned from unsloth/Qwen3.5-9B. This model is specifically optimized for structured reasoning, mathematical tasks, and long-context problem-solving, leveraging a curated dataset of Kimi K2.6, Qwen reasoning, and Claude Opus TraceInversion data. It excels at generating step-by-step reasoning traces within a 32768 token context window, making it suitable for complex analytical prompts.

Loading preview...

Overview

trjxter/Qwimi3.5-9B-Kimik2.6-Opus-Distill-MTP-BF16 is a 9 billion parameter merged BF16 causal language model, fine-tuned by trjxter from unsloth/Qwen3.5-9B. This model was developed with a primary focus on enhancing structured reasoning behavior and preserving Qwen-style chat formatting, including <think>...</think> reasoning traces.

Key Capabilities

  • Enhanced Reasoning: Specifically fine-tuned to improve structured reasoning, mathematical problem-solving, and technical reasoning.
  • Long Context: Supports a context length of 32768 tokens, making it suitable for complex, multi-step problems.
  • Distillation Approach: Utilizes a unique distillation process from high-quality reasoning datasets, including Kimi K2.6, Qwen reasoning, and Claude Opus TraceInversion data.
  • Qwen Chat Format: Maintains the familiar Qwen chat template, facilitating integration and consistent interaction.

Training Details

The model was trained using Unsloth and Hugging Face TRL with a LoRA-based supervised fine-tuning setup. It processed 12,000 training examples over 1 epoch, achieving a final training loss of 0.5517. The dataset was carefully curated and normalized into Qwen chat format, preserving assistant reasoning traces.

Intended Use

This model is ideal for experimentation in reasoning-style SFT, synthetic distillation, and exploring long-context reasoning behavior. It is particularly well-suited for tasks involving math, structured problem-solving, and coding/technical reasoning prompts. Users should note this is an experimental fine-tune and evaluate its outputs carefully.