Qwen/Qwen3.5-397B-A17B
Hugging Face

Vision · Concurrency Cost: 4 · Model Size: 397B · Quantization: FP8 · Served Context Length: 32k · Published: Feb 16, 2026 · License: apache-2.0 · Architecture: Transformer · Open Weights

Qwen3.5-397B-A17B is a multimodal causal language model developed by Qwen, with 397 billion total parameters of which 17 billion are activated. The model combines a unified vision-language foundation with an efficient hybrid architecture, reaching parity with Qwen3 across generations and outperforming Qwen3-VL models on a range of benchmarks. It is designed for robust real-world adaptability, excelling at multimodal understanding, reasoning, coding, and agentic tasks, with a native context length of 262,144 tokens that is extensible up to 1,010,000 tokens.


Qwen3.5-397B-A17B: A Multimodal Agent Foundation Model

Qwen3.5-397B-A17B is a powerful multimodal causal language model developed by Qwen, featuring 397 billion total parameters, of which 17 billion are activated per token. The model represents a significant advancement, integrating a unified vision-language foundation with an efficient hybrid architecture that combines Gated Delta Networks and a sparse Mixture-of-Experts for high-throughput inference.
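The sparse Mixture-of-Experts design is what lets a 397B-parameter model run with only ~17B parameters active: a router scores all experts per token and only the top-k are executed. A minimal sketch of that routing step, with illustrative expert counts and k (the actual values for Qwen3.5 are not stated on this card):

```python
import math

def softmax(xs):
    # Numerically stable softmax over the router logits.
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    s = sum(exps)
    return [e / s for e in exps]

def moe_route(router_logits, k=2):
    """Pick the top-k experts for one token and renormalize their gate weights.

    Only the selected experts' parameters are used for this token, which is
    how total parameter count and activated parameter count can differ.
    Expert count (len(router_logits)) and k here are illustrative.
    """
    probs = softmax(router_logits)
    topk = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)[:k]
    total = sum(probs[i] for i in topk)
    return [(i, probs[i] / total) for i in topk]

# Four hypothetical experts; the router selects the two highest-scoring ones.
gates = moe_route([0.1, 2.0, -1.0, 1.5], k=2)
# gates → experts 1 and 3, with renormalized weights summing to 1.
```

Real implementations batch this over all tokens and add load-balancing losses, but the select-and-renormalize step is the core of sparse activation.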

Key Capabilities

  • Unified Vision-Language Understanding: Achieves strong performance across reasoning, coding, agentic tasks, and visual understanding benchmarks, outperforming previous Qwen3 and Qwen3-VL models.
  • Efficient Hybrid Architecture: Utilizes Gated Delta Networks and sparse Mixture-of-Experts for optimized inference speed and cost.
  • Scalable RL Generalization: Benefits from reinforcement learning scaled across millions of agent environments, enhancing real-world adaptability.
  • Extensive Multilingual Support: Offers expanded linguistic coverage for 201 languages and dialects, facilitating global deployment.
  • Ultra-Long Context: Natively supports a context length of 262,144 tokens, extensible up to 1,010,000 tokens using techniques like YaRN.
  • Agentic Functionality: Excels in tool calling, supporting frameworks like Qwen-Agent and Qwen Code for building advanced agent applications.
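On the long-context point above: earlier Qwen releases document enabling YaRN by adding a `rope_scaling` section to the model's `config.json`. A hedged sketch, assuming Qwen3.5 follows that same convention (the exact keys and scaling factor should be checked against the official model card):

```python
# Assumed YaRN configuration fragment, mirroring the convention used by
# prior Qwen releases. A 4x factor over the native 262,144-token window
# yields 1,048,576 positions, covering the advertised 1,010,000 tokens.
rope_scaling = {
    "rope_type": "yarn",
    "factor": 4.0,
    "original_max_position_embeddings": 262144,
}

extended_window = int(rope_scaling["factor"]
                      * rope_scaling["original_max_position_embeddings"])
print(extended_window)  # 1048576
```

YaRN scaling can degrade quality on short inputs, so the usual advice is to enable it only when prompts actually exceed the native window.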

Good For

  • Complex Multimodal Tasks: Ideal for applications requiring deep understanding and generation across both text and visual inputs, including video.
  • High-Performance Inference: Suitable for scenarios demanding efficient and low-latency model serving, especially with recommended frameworks like SGLang and vLLM.
  • Global Applications: Its broad linguistic support makes it well-suited for international deployments and diverse user bases.
  • Agent Development: A strong choice for building sophisticated AI agents that can interact with tools and environments effectively.
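Since the card highlights tool calling via OpenAI-compatible servers (vLLM, SGLang) and frameworks like Qwen-Agent, here is a minimal sketch of the client-side plumbing for a tool call. The tool name, schema, and stub implementation are illustrative assumptions, not part of the model card; the wire format shown is the standard OpenAI chat-completions tool-call shape:

```python
import json

# Hypothetical tool schema, advertised to the model in the request payload.
WEATHER_TOOL = {
    "type": "function",
    "function": {
        "name": "get_weather",
        "description": "Look up the current weather for a city.",
        "parameters": {
            "type": "object",
            "properties": {"city": {"type": "string"}},
            "required": ["city"],
        },
    },
}

def get_weather(city: str) -> str:
    # Stub standing in for a real weather API.
    return f"Sunny in {city}"

LOCAL_TOOLS = {"get_weather": get_weather}

def dispatch_tool_call(tool_call: dict) -> str:
    """Execute a model-emitted tool call in chat-completions format."""
    fn = tool_call["function"]
    args = json.loads(fn["arguments"])  # arguments arrive as a JSON string
    return LOCAL_TOOLS[fn["name"]](**args)

# Shape of a tool call the model might emit in its response:
example_call = {
    "id": "call_0",
    "type": "function",
    "function": {"name": "get_weather", "arguments": '{"city": "Berlin"}'},
}
print(dispatch_tool_call(example_call))  # → Sunny in Berlin
```

In a real agent loop, the tool result would be appended to the conversation as a `tool`-role message and the model queried again; frameworks like Qwen-Agent handle that loop for you.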