unsloth/Qwen3.5-9B-Base

Vision | Concurrency Cost: 1 | Model Size: 9B | Quant: FP8 | Ctx Length: 32k | Tool Calling: Supported | Published: Mar 2, 2026 | License: apache-2.0 | Architecture: Transformer | Open Weights | Warm

unsloth/Qwen3.5-9B-Base is a 9-billion-parameter causal language model developed by Qwen, built on a unified vision-language foundation and an efficient hybrid architecture. It targets reasoning, coding, agent, and visual understanding benchmarks. With a native context length of 262,144 tokens, extensible up to 1,010,000, it is intended for fine-tuning, in-context learning, and research.


Qwen3.5-9B-Base Overview

Qwen3.5-9B-Base is a 9-billion-parameter causal language model developed by Qwen, distinguished by its unified vision-language foundation and efficient hybrid architecture. It combines advances in multimodal learning, architectural efficiency, and reinforcement learning.

Key Capabilities & Features

  • Unified Vision-Language Foundation: Achieves cross-generational parity with Qwen3 and surpasses Qwen3-VL models across reasoning, coding, agent tasks, and visual understanding benchmarks through early fusion training on multimodal tokens.
  • Efficient Hybrid Architecture: Utilizes Gated Delta Networks combined with sparse Mixture-of-Experts to deliver high-throughput inference with minimal latency and cost.
  • Scalable RL Generalization: Features reinforcement learning scaled across millions of agent environments with progressively complex task distributions, enhancing real-world adaptability.
  • Global Linguistic Coverage: Supports 201 languages and dialects, enabling broad deployment with nuanced cultural and regional understanding.
  • Extended Context Length: Natively supports 262,144 tokens, extensible up to 1,010,000 tokens.
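
To make the context figures concrete, the following minimal sketch computes the ratio between the extended and native windows and checks whether a request fits a given window. The helper names are hypothetical and not part of any library; the card does not state which mechanism provides the extension, so the ratio is plain arithmetic rather than a configuration value.

```python
# Context window sizes as stated in the feature list above.
NATIVE_CONTEXT = 262_144      # equals 2**18 tokens
EXTENDED_CONTEXT = 1_010_000


def extension_factor(native: int = NATIVE_CONTEXT,
                     extended: int = EXTENDED_CONTEXT) -> float:
    """Ratio between the extended and the native context window."""
    return extended / native


def fits_in_context(prompt_tokens: int, max_new_tokens: int,
                    limit: int = NATIVE_CONTEXT) -> bool:
    """True if the prompt plus the requested generation fits in the window."""
    return prompt_tokens + max_new_tokens <= limit


if __name__ == "__main__":
    print(f"extension factor: {extension_factor():.2f}")  # about 3.85
    print(fits_in_context(250_000, 8_192))   # True: 258,192 tokens in total
    print(fits_in_context(260_000, 8_192))   # False: 268,192 tokens in total
```

A hosted deployment may expose a smaller window than the native one, so pass the served limit as `limit` when budgeting requests.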

Intended Use Cases

This pre-trained model is primarily intended for:

  • Fine-tuning for specific downstream tasks.
  • In-context learning experiments.
  • General research and development purposes.
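
For in-context learning with a pre-trained (non-instruction-tuned) model, prompts are usually written as plain text demonstrations rather than chat messages. The sketch below shows one possible layout; the function name and the "Input"/"Output" labels are illustrative assumptions, not a format prescribed by the model card.

```python
from typing import Sequence, Tuple


def build_few_shot_prompt(examples: Sequence[Tuple[str, str]],
                          query: str,
                          input_label: str = "Input",
                          output_label: str = "Output") -> str:
    """Join solved demonstrations and leave the final answer slot open."""
    blocks = [f"{input_label}: {x}\n{output_label}: {y}" for x, y in examples]
    # The last block ends right after the output label so the model
    # continues the text with its own answer.
    blocks.append(f"{input_label}: {query}\n{output_label}:")
    return "\n\n".join(blocks)


if __name__ == "__main__":
    demos = [("cheval", "horse"), ("chat", "cat")]
    print(build_few_shot_prompt(demos, "chien"))
```

The resulting string can be passed as an ordinary completion prompt to whichever inference stack is in use.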

The model is compatible with Hugging Face Transformers, vLLM, and SGLang. Its control tokens are optimized for efficient LoRA-style PEFT, which reduces the need for embedding fine-tuning despite the larger vocabulary.
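
As a rough illustration of the LoRA point, the sketch below attaches adapters with the `peft` library while leaving the embeddings and output head frozen. It assumes recent versions of `transformers`, `peft`, and `accelerate` are installed and that the checkpoint loads through `AutoModelForCausalLM`; the rank, alpha, dropout, and target-module choices are illustrative values, not recommendations from the model card.

```python
MODEL_ID = "unsloth/Qwen3.5-9B-Base"


def lora_kwargs(rank: int = 16) -> dict:
    """Keyword arguments for peft.LoraConfig (illustrative hyperparameters)."""
    return {
        "r": rank,
        "lora_alpha": 2 * rank,
        "lora_dropout": 0.05,
        "target_modules": "all-linear",  # assumption: adapt every linear layer
        "task_type": "CAUSAL_LM",
        # No "modules_to_save" entry: the embedding matrix and output head
        # stay frozen, so only the adapter weights are trained.
    }


def load_with_lora(model_id: str = MODEL_ID):
    """Download the checkpoint and wrap it with LoRA adapters."""
    # Imported lazily so lora_kwargs() works without the heavy dependencies.
    from peft import LoraConfig, get_peft_model
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(model_id)
    # device_map="auto" requires the accelerate package.
    model = AutoModelForCausalLM.from_pretrained(model_id, device_map="auto")
    model = get_peft_model(model, LoraConfig(**lora_kwargs()))
    model.print_trainable_parameters()
    return tokenizer, model
```

Calling `load_with_lora()` downloads the full weights, so it is best run on a machine with enough accelerator memory; `lora_kwargs()` alone can be inspected anywhere.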