Qwopus3.5-9B-v3.5 Overview

Qwopus3.5-9B-v3.5 is a 9-billion-parameter model and a data-scaled continuation of Qwopus3.5-9B-v3, built on the Qwen3.5-9B architecture. It focuses on improving structured reasoning through an expanded, high-quality Supervised Fine-Tuning (SFT) dataset roughly twice the size of its predecessor's. This version introduces no new architecture or RL stages; it relies on data scaling to improve generalization.

Key Capabilities & Design Principles

Reasoning Enhancement: Designed for structured reasoning, puzzle-solving, and STEM-related tasks, aiming to better utilize and activate latent knowledge.
Agentic Workflows: Optimized for tool-augmented workflows and multi-step agentic tasks, including code inspection and bug diagnosis.
Broad Domain Coverage: Training data covers mathematics, programming, multilingual dialogue, and instruction-following.
Efficiency: Engineered for token-efficient inference.

Generalization & Performance Insights

The model is motivated by the hypothesis that scaling high-quality SFT data improves generalization: the aim is for it to learn reasoning procedures rather than just output formats. A dedicated public benchmark report for the 9B model is not yet available. Methodology references from the 27B line suggest improvements on benchmarks such as MATH500, MMLU-Pro, HumanEval, and GSM8K. In preliminary evaluations on a subset of MMLU-Pro, the 27B line showed a +1.07 percentage point gain with v3.5, and SWE-style tests showed improved performance on multi-step agentic coding tasks.
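A gain quoted in percentage points is an absolute difference between two accuracies, not a relative improvement. The sketch below illustrates the distinction. The question counts are hypothetical placeholders chosen for easy arithmetic, not results from any Qwopus evaluation, and the function names are illustrative only.

```python
def accuracy(correct: int, total: int) -> float:
    """Return accuracy as a percentage in [0, 100]."""
    if total <= 0:
        raise ValueError("total must be positive")
    return 100.0 * correct / total


def percentage_point_gain(base_acc: float, new_acc: float) -> float:
    """Absolute difference between two accuracies, in percentage points."""
    return new_acc - base_acc


def relative_gain(base_acc: float, new_acc: float) -> float:
    """Improvement relative to the baseline accuracy, in percent."""
    return 100.0 * (new_acc - base_acc) / base_acc


# Hypothetical counts for illustration only (NOT real benchmark data).
base = accuracy(correct=700, total=1000)  # baseline model
new = accuracy(correct=712, total=1000)   # data-scaled model

pp = percentage_point_gain(base, new)
rel = relative_gain(base, new)
print(f"gain: {pp:.2f} percentage points ({rel:.2f}% relative)")
```

Because the two figures can differ noticeably, a reported gain is easiest to interpret when it is read alongside the baseline accuracy and the size of the evaluated subset.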

Limitations

Potential limitations include overfitting if data scaling exceeds the optimal regime, instability in edge-case reasoning, and tool-calling performance that depends on environment integration. Not all capabilities have been fully benchmarked.
