hxssgaa/Qwen3-VL-8B-Interleave-Thinking

VISIONConcurrency Cost:1Model Size:8BQuant:FP8Ctx Length:32kTool Calling:SupportedPublished:Jan 1, 2026License:apache-2.0Architecture:Transformer0.0K Open Weights Cold

Qwen3-VL-8B-Interleave-Thinking is an 8 billion parameter agentic model developed by hxssgaa, fine-tuned from Qwen/Qwen3-VL-8B-Thinking. It specializes in interleaved thinking, generating internal thought processes before executing function calls, and supports long-horizon function calling. This model is optimized for complex, multi-step tasks requiring robust tool use and reasoning chains.

Loading preview...

Overview

Qwen3-VL-8B-Interleave-Thinking is an 8 billion parameter agentic model, fine-tuned by hxssgaa from the Qwen3-VL-8B-Thinking base model. It is specifically designed to emulate the behavior of agent SDKs, focusing on advanced reasoning and tool utilization. The model's core innovation lies in its interleaved thinking capability, where it generates a detailed thought process before making function calls, enhancing planning and error correction.

Key Capabilities

  • Interleaved Thinking: Generates internal reasoning traces prior to executing function calls, improving decision-making and task planning.
  • Long-Horizon Function Calling: Capable of managing complex, multi-step tasks by maintaining a coherent thought process across interactions.
  • Agentic Focus: Optimized for scenarios requiring sophisticated tool use and strategic decision-making on why and how to employ tools.

Training and Use Cases

This model was fine-tuned on the xlam-interleave-thinking-40k dataset, which contains 40,000 high-quality examples distilled from MiniMax M2.1, ensuring a rigorous thinking pattern for autonomous agents. It is particularly well-suited for applications requiring robust agentic behaviors, complex problem-solving, and scenarios where explicit reasoning steps are beneficial for task execution and debugging. The current v0.1 release is based on Supervised Fine-Tuning (SFT), with future plans for large-scale Reinforcement Learning (RL) to further refine its agentic policies.