ScottzillaSystems/Qwen3.5-9B
Qwen3.5-9B is a 9 billion parameter causal language model developed by Qwen, featuring a unified vision-language foundation and an efficient hybrid architecture. It excels in multimodal understanding, reasoning, coding, and agent capabilities, supporting a native context length of 262,144 tokens, extensible up to 1,010,000 tokens. This model is particularly strong in general agent tasks and global linguistic coverage, supporting 201 languages and dialects.
Qwen3.5-9B: A Multimodal Agent Foundation Model
Qwen3.5-9B is a 9 billion parameter multimodal large language model developed by Qwen. It combines advances in multimodal learning, architectural efficiency, and reinforcement learning.
Key Capabilities & Features
- Unified Vision-Language Foundation: Achieves strong performance in reasoning, coding, agent tasks, and visual understanding through early fusion training on multimodal tokens.
- Efficient Hybrid Architecture: Combines Gated Delta Networks with sparse Mixture-of-Experts for high-throughput, low-latency inference.
- Scalable RL Generalization: Features reinforcement learning scaled across millions of agent environments, enhancing real-world adaptability.
- Global Linguistic Coverage: Supports 201 languages and dialects, enabling broad deployment with nuanced cultural understanding.
- Extended Context Length: Natively handles up to 262,144 tokens, extensible to 1,010,000 tokens using YaRN scaling techniques.
- Tool Calling: Supports tool calling; the recommended integration path is Qwen-Agent or Qwen Code.
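The context-length figures above imply a scaling ratio of roughly 1,010,000 / 262,144 ≈ 3.85 between the native and extended windows. The sketch below works through that arithmetic and classifies a prompt by the regime it would need. The helper names (`yarn_factor`, `context_plan`) and the keys in the `rope_scaling` dictionary are assumptions made for illustration, not a confirmed configuration for this model; check the model's own configuration files before relying on any key name or value.

```python
NATIVE_CONTEXT = 262_144      # native window stated in the model card
EXTENDED_CONTEXT = 1_010_000  # extended window stated in the model card


def yarn_factor(target_tokens: int, native_tokens: int = NATIVE_CONTEXT) -> float:
    """Return the ratio by which the native window must be stretched (at least 1.0)."""
    if target_tokens <= 0:
        raise ValueError("target_tokens must be positive")
    return max(1.0, target_tokens / native_tokens)


def context_plan(prompt_tokens: int) -> str:
    """Classify a prompt by the context regime it would need."""
    if prompt_tokens <= NATIVE_CONTEXT:
        return "native"      # fits without any scaling
    if prompt_tokens <= EXTENDED_CONTEXT:
        return "yarn"        # would need the extended window
    return "too_long"        # exceeds the stated extended limit


# Hypothetical config fragment: the key names are assumptions, not confirmed
# for this model. Only the two token counts come from the model card.
rope_scaling = {
    "rope_type": "yarn",
    "factor": round(yarn_factor(EXTENDED_CONTEXT), 2),
    "original_max_position_embeddings": NATIVE_CONTEXT,
}

print(rope_scaling)
print(context_plan(300_000))
```

A check like `context_plan` lets short prompts stay on the native window, so scaling is only considered for inputs that actually exceed 262,144 tokens.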
Performance Highlights
Reported benchmark scores for Qwen3.5-9B, by category:
- Language: Achieves 82.5 on MMLU-Pro, 91.1 on MMLU-Redux, and 88.2 on C-Eval.
- Instruction Following: Scores 91.5 on IFEval and 64.5 on IFBench.
- Long Context: Attains 63.0 on AA-LCR and 55.2 on LongBench v2.
- Vision Language: Scores 78.4 on MMMU, 78.9 on MathVision, and 90.1 on MMBench.
- General Agent: Achieves 66.1 on BFCL-V4 and 79.1 on TAU2-Bench, indicating strong agentic capabilities.
When to Use This Model
Qwen3.5-9B is ideal for applications requiring:
- Multimodal understanding: Processing and reasoning over both text and image/video inputs.
- Complex agentic tasks: Building AI agents that can interact with environments and use tools effectively.
- Long context processing: Handling extensive documents or conversations with its large context window.
- Global applications: Leveraging its broad multilingual support for diverse user bases.
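For the agentic use case, the application side of tool calling can be sketched without any model in the loop: the program keeps a registry of callable tools and executes whatever call the model emits. The example below is a minimal sketch under stated assumptions. The `get_weather` tool, the `dispatch` helper, and the JSON shape of the tool call (`name` plus `arguments`) are all hypothetical; in practice Qwen-Agent or Qwen Code would handle prompt templating and parsing of the model's output.

```python
import json
from typing import Any, Callable, Dict


def get_weather(city: str) -> str:
    """Hypothetical tool used only for illustration; returns a canned answer."""
    return f"Sunny in {city}"


# Registry mapping tool names to Python callables.
TOOLS: Dict[str, Callable[..., Any]] = {"get_weather": get_weather}


def dispatch(tool_call_json: str) -> str:
    """Run one tool call given as JSON and return a JSON-encoded result.

    The {"name": ..., "arguments": {...}} layout is an assumed format,
    not one confirmed by the model card.
    """
    call = json.loads(tool_call_json)
    name = call["name"]
    args = call.get("arguments", {})
    if name not in TOOLS:
        return json.dumps({"error": f"unknown tool: {name}"})
    try:
        result = TOOLS[name](**args)
    except TypeError as exc:
        # Raised when the supplied arguments do not match the tool's signature.
        return json.dumps({"error": str(exc)})
    return json.dumps({"result": result})


print(dispatch('{"name": "get_weather", "arguments": {"city": "Oslo"}}'))
```

Returning errors as JSON rather than raising lets the failure be passed back to the model as an ordinary tool result, so an agent loop can retry with corrected arguments.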