Jackrong/Qwopus3.5-9B-v3

VISIONConcurrency Cost:1Model Size:9BQuant:FP8Ctx Length:32kTool Calling:SupportedPublished:Mar 30, 2026License:apache-2.0Architecture:Transformer0.1K Open Weights Cold

Jackrong/Qwopus3.5-9B-v3 is a 9 billion parameter reasoning-enhanced model based on Qwen3.5-9B, optimized for structural reasoning and tool-calling reinforcement with a 32K context length. It is designed for complex, multi-step agentic workflows, shifting from "reason-then-act" to an "act-then-refine" paradigm. This model excels in programming tasks, achieving 87.80% on HumanEval, and improves reasoning efficiency by 31.7% while reducing cost per correct answer by 24.0%.

Loading preview...

Qwopus3.5-9B-v3: Reasoning-Enhanced LLM for Agentic Workflows

Jackrong/Qwopus3.5-9B-v3 is a 9 billion parameter model built upon Qwen3.5-9B, specifically engineered to enhance reasoning stability, correctness, and inference efficiency, particularly for programming tasks. It introduces an "act-then-refine" paradigm, prioritizing execution-driven optimization over deep pre-execution reasoning for multi-step agent systems.

Key Capabilities & Differentiators

  • Structural Reasoning Optimization: Moves beyond traditional CoT distillation by focusing on verifiable, explicit reasoning chains, improving faithfulness and generalization. This results in more stable and accurate reasoning paths.
  • Tool-Calling Reinforcement: Incorporates specialized Reinforcement Learning (RL) training to strengthen stability and proficiency in tool invocation within agent frameworks like OpenClaw.
  • Improved Reasoning Efficiency: Achieves a 25.3% shorter average reasoning length and 31.7% higher efficiency, leading to a 24.0% lower cost per correct answer compared to Qwen3.5-9B.
  • Strong Programming Performance: Attains a base pass@1 of 87.80% on the HumanEval benchmark, outperforming Qwen3.5-9B (82.93%) and Claude-Distilled-v2 (82.32%).
  • MMLU-Pro Improvement: Shows a modest but significant +1.43 pp accuracy lead over Qwen3.5-9B on the MMLU-Pro benchmark, achieving 81.79%.

Good For

  • Offline Analytical Tasks: Excels in scenarios requiring transparent, step-by-step logical processing.
  • Coding and Mathematical Reasoning: Demonstrated strong performance on HumanEval and MMLU-Pro, making it suitable for programming and logic-heavy applications.
  • Agentic Workflows: Designed to support complex, multi-step agent systems that benefit from iterative refinement and tool use.

This model was fine-tuned using Unsloth and a high-fidelity reasoning dataset, with a focus on process-level reasoning learning rather than mere answer imitation.