Qwen/Qwen3-235B-A22B-Thinking-2507

Text Generation · Concurrency Cost: 4 · Model Size: 235B · Quant: FP8 · Ctx Length: 32k · Published: Jul 25, 2025 · License: apache-2.0 · Architecture: Transformer · Open Weights

Qwen/Qwen3-235B-A22B-Thinking-2507 is a 235-billion-parameter Mixture-of-Experts causal language model developed by Qwen, with 22 billion parameters activated per token. The model is optimized for complex reasoning tasks, including logical reasoning, mathematics, science, and coding, and achieves state-of-the-art results among open-source thinking models. It has a native context length of 262,144 tokens, extendable to 1 million tokens with Dual Chunk Attention and MInference, making it suitable for ultra-long text processing.


Qwen3-235B-A22B-Thinking-2507: Enhanced Reasoning and Ultra-Long Context

Qwen3-235B-A22B-Thinking-2507 is a 235-billion-parameter Mixture-of-Experts causal language model from Qwen, with 22 billion parameters activated per token, designed specifically to excel at complex reasoning. This iteration significantly improves on its predecessor, increasing both the depth and the quality of its thinking capability across domains.
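
As a concrete starting point, here is a minimal usage sketch with Hugging Face transformers, adapted from the pattern in the upstream model card. The `</think>` token id (151668) and the sampling settings are taken from that card; treat them as assumptions to re-check against the current release.

```python
# Minimal sketch: load the model and separate the "thinking" segment from
# the final answer. The chat template opens a <think> block automatically,
# so generated output contains only the closing </think>.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "Qwen/Qwen3-235B-A22B-Thinking-2507"

tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(
    model_name,
    torch_dtype="auto",  # use the published weight dtype
    device_map="auto",   # shard across available GPUs
)

messages = [{"role": "user", "content": "How many primes are there below 100?"}]
text = tokenizer.apply_chat_template(
    messages, tokenize=False, add_generation_prompt=True
)
inputs = tokenizer([text], return_tensors="pt").to(model.device)

generated = model.generate(
    **inputs,
    max_new_tokens=32768,
    do_sample=True,
    temperature=0.6,  # sampling values recommended in the model card
    top_p=0.95,
    top_k=20,
)
output_ids = generated[0][len(inputs.input_ids[0]):].tolist()

# Split on the </think> token (id 151668 per the model card) to separate
# the reasoning trace from the final answer.
try:
    index = len(output_ids) - output_ids[::-1].index(151668)
except ValueError:
    index = 0
thinking = tokenizer.decode(output_ids[:index], skip_special_tokens=True).strip()
answer = tokenizer.decode(output_ids[index:], skip_special_tokens=True).strip()
print(answer)
```

The generation budget is deliberately large: the thinking segment often dominates output length, so a small `max_new_tokens` can truncate the answer mid-reasoning.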

Key Capabilities & Differentiators

  • Superior Reasoning Performance: Achieves state-of-the-art results among open-source thinking models on tasks requiring logical reasoning, mathematics (e.g., AIME25, HMMT25), science, and coding (e.g., LiveCodeBench v6, CFEval).
  • Enhanced General Capabilities: Demonstrates marked improvements in instruction following, tool usage, text generation, and alignment with human preferences.
  • Native 256K Context Length: Supports a substantial native context window of 262,144 tokens.
  • 1 Million Token Context with DCA/MInference: Integrates Dual Chunk Attention (DCA) and MInference techniques to extend context processing up to 1 million tokens, offering up to a 3x speedup for ultra-long sequences compared to standard attention.
  • Dedicated Thinking Mode: This model operates exclusively in thinking mode; its chat template automatically inserts the opening `<think>` tag, so generated output contains only the closing `</think>` (parsing is shown in the quickstart sketch above).
  • Agentic Use: Excels in tool calling; Qwen-Agent is recommended for streamlined integration and management of tools (see the sketch after this list).
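
For tool calling, the model card points to Qwen-Agent, which bundles the tool-parsing templates for Qwen models. The sketch below is an assumption-laden example, not a definitive setup: the endpoint URL is a placeholder for wherever you serve the model, and `code_interpreter` is one of Qwen-Agent's built-in tools.

```python
# Hedged sketch of agentic tool calling through Qwen-Agent.
from qwen_agent.agents import Assistant

llm_cfg = {
    "model": "Qwen3-235B-A22B-Thinking-2507",
    "model_server": "http://localhost:8000/v1",  # placeholder OpenAI-compatible endpoint
    "api_key": "EMPTY",
}

# function_list can mix built-in tools and custom tool configs.
bot = Assistant(llm=llm_cfg, function_list=["code_interpreter"])

messages = [{"role": "user", "content": "Plot y = x**2 for x in [0, 10]."}]
responses = []
for responses in bot.run(messages=messages):
    pass  # bot.run streams incremental response lists; keep the last one
print(responses)
```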

Should you use this for your use case?

This model is ideal for applications that demand highly complex reasoning, such as advanced problem-solving, detailed analysis, and code generation, especially over very long inputs. Its dedicated thinking mode and extended context make it a strong candidate for tasks where deep understanding and intricate logical steps are crucial. Be aware, however, that enabling the 1M-token context requires significant GPU memory (approximately 1000 GB).
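
If you do need the extended window, the upstream model card gives a vLLM recipe built on the Dual Chunk Attention backend. The sketch below follows that recipe; the `DUAL_CHUNK_FLASH_ATTN` backend name, the length and batching limits, and the 8-GPU tensor-parallel layout are all taken from it and may shift across vLLM versions, so verify against current documentation before relying on them.

```python
# Sketch: offline inference at ~1M-token context with vLLM, per the
# model card's recipe. All limits below are assumptions to re-check.
import os

os.environ["VLLM_ATTENTION_BACKEND"] = "DUAL_CHUNK_FLASH_ATTN"  # enable DCA path
os.environ["VLLM_ALLOW_LONG_MAX_MODEL_LEN"] = "1"

from vllm import LLM, SamplingParams

llm = LLM(
    model="Qwen/Qwen3-235B-A22B-Thinking-2507",
    tensor_parallel_size=8,   # spreads the ~1000 GB requirement over 8 GPUs
    max_model_len=1010000,    # ~1M tokens; native window is 262,144
    enable_chunked_prefill=True,
    max_num_batched_tokens=131072,
    enforce_eager=True,
    max_num_seqs=1,           # one ultra-long sequence at a time
)

params = SamplingParams(temperature=0.6, top_p=0.95, top_k=20, max_tokens=32768)
outputs = llm.chat(
    [{"role": "user", "content": "Summarize the document below.\n..."}],
    params,
)
print(outputs[0].outputs[0].text)
```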