RedHatAI/Qwen3.6-35B-A3B
RedHatAI/Qwen3.6-35B-A3B is a 35.1 billion parameter Mixture-of-Experts (MoE) causal language model developed by Qwen, with 3 billion activated parameters. It features a native context length of 262,144 tokens, extensible up to 1,010,000 tokens, and includes a vision encoder. This model is specifically optimized for agentic coding, handling frontend workflows and repository-level reasoning, and supports multimodal inputs including images and video.
Loading preview...
Qwen3.6-35B-A3B Overview
Qwen3.6-35B-A3B is a 35.1 billion parameter Mixture-of-Experts (MoE) causal language model from Qwen, designed for enhanced stability and real-world utility. It features 3 billion activated parameters and a native context length of 262,144 tokens, which can be extended up to 1,010,000 tokens using YaRN scaling techniques. The model incorporates a vision encoder, enabling multimodal capabilities for image and video input.
Key Capabilities
- Agentic Coding: Significantly improved handling of frontend workflows and repository-level reasoning, making it highly effective for development tasks.
- Thinking Preservation: Introduces an option to retain reasoning context from historical messages, streamlining iterative development and improving decision consistency.
- Multimodal Input: Supports processing of text, image, and video inputs, with specific optimizations for video understanding.
- MoE Architecture: Utilizes a Mixture-of-Experts design with 256 experts (8 routed + 1 shared activated), offering performance benefits for hardware with more compute than memory bandwidth.
What Makes This Model Different?
Qwen3.6-35B-A3B stands out due to its strong focus on agentic coding and the innovative thinking preservation feature. Its MoE architecture provides an efficient solution for high-performance inference, especially when compared to dense models of similar total parameter count. Benchmarks show strong performance in coding agent tasks like SWE-bench and Terminal-Bench 2.0, as well as competitive results in general agent, knowledge, STEM, and vision language tasks, including multimodal benchmarks like MMMU and RealWorldQA.
Should You Use This?
This model is ideal for developers and researchers focused on:
- Agent-based applications, particularly those involving complex coding tasks, frontend development, or repository-level analysis.
- Scenarios requiring long context windows (up to 1M tokens) and the ability to maintain reasoning context across multiple interactions.
- Applications that benefit from multimodal understanding, including image and video analysis.
- Environments where inference efficiency on specific hardware configurations (more compute, less memory bandwidth) is a priority due to its MoE design.