Qwen3.5-397B-A17B: A Unified Multimodal Agent
Overview
Qwen3.5-397B-A17B is a causal language model from Qwen with an integrated vision encoder, designed for advanced multimodal and agentic applications. It has 397 billion total parameters, of which 17 billion are activated per token, and uses an efficient hybrid architecture that combines Gated Delta Networks with sparse Mixture-of-Experts layers for fast inference. The model natively supports a context length of 262,144 tokens, extensible to 1,010,000 tokens with YaRN scaling, enabling ultra-long inputs and complex long-horizon tasks.
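The relationship between the native and extended windows determines the YaRN scaling factor. A minimal sketch of the arithmetic, plus a transformers-style `rope_scaling` block; the exact config keys for this model are an assumption, not confirmed by this card:

```python
# Sketch: deriving the YaRN scaling factor for context extension.
NATIVE_CONTEXT = 262_144    # native window from the model card
TARGET_CONTEXT = 1_010_000  # extended window from the model card

factor = TARGET_CONTEXT / NATIVE_CONTEXT  # ≈ 3.85

# In a transformers-style config this is typically expressed as a
# rope_scaling block (key names are an assumption for this model):
rope_scaling = {
    "rope_type": "yarn",
    "factor": 4.0,  # rounded up so the extended window covers the target
    "original_max_position_embeddings": NATIVE_CONTEXT,
}

print(round(factor, 2))
```

Rounding the factor up (here to 4.0) is a common convention so the scaled window comfortably covers the stated target length.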
Key Capabilities
- Unified Vision-Language Foundation: Achieves strong performance across reasoning, coding, agents, and visual understanding benchmarks through early fusion training on multimodal tokens.
- Efficient Hybrid Architecture: Employs Gated Delta Networks and sparse Mixture-of-Experts for high-throughput inference with reduced latency and cost.
- Scalable RL Generalization: Benefits from reinforcement learning scaled across millions of agent environments, enhancing real-world adaptability.
- Global Linguistic Coverage: Supports 201 languages and dialects, facilitating inclusive worldwide deployment.
- Extended Context Handling: Natively processes up to 262,144 tokens, with extensibility to over 1 million tokens for long-horizon tasks.
- Agentic Excellence: Demonstrates strong tool-calling capabilities, optimized for building agent applications.
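For the tool-calling capability above, models like this are commonly served behind OpenAI-compatible endpoints. A minimal sketch of such a request payload; the served-model name and the weather tool are illustrative assumptions, not APIs shipped with the model:

```python
# Sketch of an OpenAI-compatible tool-calling request body.
# Model id and tool schema are illustrative assumptions.
payload = {
    "model": "Qwen3.5-397B-A17B",  # hypothetical served-model name
    "messages": [
        {"role": "user", "content": "What's the weather in Berlin today?"}
    ],
    "tools": [
        {
            "type": "function",
            "function": {
                "name": "get_weather",  # illustrative tool definition
                "description": "Look up current weather for a city.",
                "parameters": {
                    "type": "object",
                    "properties": {"city": {"type": "string"}},
                    "required": ["city"],
                },
            },
        }
    ],
    "tool_choice": "auto",  # let the model decide whether to call the tool
}

print(payload["tools"][0]["function"]["name"])
```

With `tool_choice` set to `"auto"`, the model returns either a normal assistant message or a structured tool call whose arguments match the declared JSON schema.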
Good for
- Complex Multimodal Reasoning: Ideal for tasks requiring both visual and linguistic understanding, such as STEM problems with diagrams or document analysis.
- Agent Development: Suited for building sophisticated AI agents that interact with environments and use tools effectively.
- Ultra-Long Document Processing: Excellent for applications needing to process and understand very long texts, like legal documents or extensive research papers.
- Global Applications: Broad support for 201 languages and dialects suits international deployments and diverse user bases.
- High-Throughput Inference: The efficient architecture is beneficial for production environments requiring fast and cost-effective model serving.
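For documents that exceed even the extended window, a common pre-processing step is to split the input into overlapping chunks. A minimal sketch, assuming token counts approximated by list length (a real pipeline would count with the model's tokenizer):

```python
def chunk_text(tokens, max_tokens=262_144, overlap=1_024):
    """Split a token list into overlapping chunks that each fit the window.

    Consecutive chunks share `overlap` tokens so context is not lost
    at chunk boundaries.
    """
    step = max_tokens - overlap
    chunks = []
    for start in range(0, len(tokens), step):
        chunks.append(tokens[start:start + max_tokens])
        if start + max_tokens >= len(tokens):
            break
    return chunks

# Tiny demo with a small window so the behavior is visible:
demo = list(range(10))
print(chunk_text(demo, max_tokens=4, overlap=1))
# → [[0, 1, 2, 3], [3, 4, 5, 6], [6, 7, 8, 9]]
```

Each chunk can then be processed within the native 262,144-token window, with the overlap preserving continuity across boundaries.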