Qwen3.5-9B: A Multimodal Agent Foundation Model

Qwen3.5-9B is a 9-billion-parameter multimodal large language model developed by Qwen. It combines early-fusion multimodal training, an efficiency-oriented hybrid architecture, and large-scale reinforcement learning.

Key Capabilities & Features

Unified Vision-Language Foundation: Achieves strong performance across reasoning, coding, agent tasks, and visual understanding through early-fusion training on multimodal tokens.
Efficient Hybrid Architecture: Utilizes Gated Delta Networks combined with sparse Mixture-of-Experts for high-throughput inference with minimal latency.
Scalable RL Generalization: Features reinforcement learning scaled across million-agent environments for robust real-world adaptability.
Global Linguistic Coverage: Supports 201 languages and dialects, enabling inclusive worldwide deployment.
Ultra-Long Context: Natively handles up to 262,144 tokens, extensible to 1,010,000 tokens using YaRN scaling techniques.
Multimodal Input: Supports text, image, and video inputs.
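
The context-length figures above imply a YaRN scaling factor of roughly 3.85 (1,010,000 / 262,144). The following is a minimal sketch of that arithmetic in Python. The `rope_scaling_config` helper and its key names are hypothetical: they mirror the `rope_scaling` convention used in configs for earlier Qwen releases and are an assumption here, not something confirmed for this model, so check the full model card before relying on them.

```python
import math

NATIVE_CONTEXT = 262_144      # native window stated in the feature list
EXTENDED_CONTEXT = 1_010_000  # extended window stated in the feature list


def yarn_factor(target_tokens: int, native_tokens: int = NATIVE_CONTEXT) -> float:
    """Smallest scaling factor that stretches the native window to target_tokens."""
    if target_tokens <= 0 or native_tokens <= 0:
        raise ValueError("token counts must be positive")
    # No scaling is needed when the target already fits in the native window.
    return max(1.0, target_tokens / native_tokens)


def rope_scaling_config(target_tokens: int) -> dict:
    """Hypothetical config fragment; the key names are assumptions for illustration."""
    # Round the factor up to two decimals so the scaled window still covers the target.
    factor = math.ceil(yarn_factor(target_tokens) * 100) / 100
    return {
        "rope_type": "yarn",
        "factor": factor,
        "original_max_position_embeddings": NATIVE_CONTEXT,
    }


if __name__ == "__main__":
    print(round(yarn_factor(EXTENDED_CONTEXT), 4))
    print(rope_scaling_config(EXTENDED_CONTEXT))
```

Rounding the factor up rather than to the nearest value is a design choice in this sketch: a factor slightly below the exact ratio would leave the scaled window a few tokens short of the target.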

Performance Highlights

Qwen3.5-9B posts strong benchmark results, often outperforming previous Qwen3 models and competing models in its size class across several domains:

Language: Achieves 82.5 on MMLU-Pro, 88.2 on C-Eval, and 91.5 on IFEval.
Vision Language: Scores 78.4 on MMMU, 78.9 on MathVision, and 90.1 on MMBench (EN-DEV-v1.1).
Agentic Capabilities: Scores 66.1 on BFCL-V4 and 79.1 on TAU2-Bench among general agent benchmarks, and 45.6 on TIR-Bench for tool calling.

Good for

Developing multimodal applications requiring advanced reasoning and visual understanding.
Applications needing extensive language support and long context processing.
Building AI agents with strong tool-calling capabilities.
High-throughput inference scenarios where efficiency is critical.
