WebScraper991923/Affine-S6
Text generation · Concurrency cost: 1 · Model size: 4B · Quant: BF16 · Ctx length: 32k · Published: Dec 31, 2025 · License: apache-2.0 · Architecture: Transformer · Open weights · Warm

Qwen3-4B-Thinking-2507 is a 4.0 billion parameter causal language model developed by Qwen, specifically enhanced for complex reasoning tasks. It features a substantial 262,144 token context length and significantly improved performance across logical reasoning, mathematics, science, and coding benchmarks. This model is optimized for scenarios requiring deep analytical thought and advanced problem-solving capabilities.


Overview

Qwen3-4B-Thinking-2507 is a 4.0 billion parameter causal language model from Qwen, designed with a strong emphasis on thinking capability and complex reasoning. It builds upon previous Qwen3-4B versions, offering significant improvements in both the quality and depth of reasoning across various domains.

Key Enhancements & Capabilities

  • Enhanced Reasoning: Demonstrates markedly improved performance on logical reasoning, mathematics, science, coding, and academic benchmarks requiring human-level expertise.
  • General Capabilities: Features better instruction following, tool usage, text generation, and alignment with human preferences.
  • Extended Context: Natively supports an impressive 262,144 token context length, making it suitable for highly complex reasoning tasks that demand extensive input.
  • Dedicated Thinking Mode: This model operates exclusively in thinking mode. The chat template automatically prepends the opening <think> tag, so generated output typically contains only the closing </think> tag marking the end of the internal reasoning.
  • Agentic Use: Excels in tool-calling capabilities, with recommendations to use Qwen-Agent for streamlined integration.
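Because the chat template opens the thinking block on the model's behalf, generated text usually contains only a closing `</think>` marker. A minimal sketch of splitting such output into reasoning and final answer (the helper name is illustrative, not part of any Qwen API):

```python
def split_thinking_output(text: str) -> tuple[str, str]:
    """Split generated text into (reasoning, answer).

    Assumes the chat template already emitted the opening <think> tag,
    so the model's output contains at most a closing </think> marker.
    """
    marker = "</think>"
    if marker in text:
        reasoning, _, answer = text.partition(marker)
        return reasoning.strip(), answer.strip()
    # No marker found: treat the whole output as the final answer.
    return "", text.strip()


raw = "The user asks for 2+2. Basic arithmetic.</think>2 + 2 = 4."
reasoning, answer = split_thinking_output(raw)
print(answer)  # → 2 + 2 = 4.
```

In practice the same split can be done at the token level (the official Qwen model cards locate the `</think>` token id in the generated ids), but string partitioning is sufficient when working with decoded text.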

Performance Highlights

Compared to its predecessor, Qwen3-4B-Thinking-2507 shows notable gains across various benchmarks:

  • Reasoning: Achieves 81.3 on AIME25 and 55.5 on HMMT25, surpassing previous versions.
  • Coding: Scores 55.2 on LiveCodeBench v6.
  • Alignment: Reaches 87.4 on IFEval and 75.6 on Creative Writing v3.
  • Agent: Shows significant improvements in BFCL-v3 and TAU benchmarks.

Recommended Use Cases

This model is particularly well-suited for applications requiring deep analytical processing, complex problem-solving, and advanced logical inference, especially where long context understanding is critical. It is recommended for highly complex reasoning tasks and agentic workflows.
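The agentic workflows mentioned above reduce to a loop: the model emits a structured tool call, the host executes it, and the result is fed back into the conversation. A minimal, model-free sketch of the dispatch step (the JSON call format, tool names, and function names here are illustrative assumptions, not the Qwen-Agent API):

```python
import json

# Hypothetical tool registry; a real deployment would wire these
# entries to actual APIs or local functions.
TOOLS = {
    "add": lambda args: args["a"] + args["b"],
    "upper": lambda args: args["text"].upper(),
}

def dispatch_tool_call(raw_call: str):
    """Parse a JSON tool call of the form
    {"name": ..., "arguments": {...}} and run the matching tool."""
    call = json.loads(raw_call)
    tool = TOOLS.get(call["name"])
    if tool is None:
        raise KeyError(f"unknown tool: {call['name']}")
    return tool(call["arguments"])

result = dispatch_tool_call('{"name": "add", "arguments": {"a": 2, "b": 3}}')
print(result)  # → 5
```

For production use, the model card's recommendation is to rely on Qwen-Agent, which handles tool-call parsing, templating, and the execution loop rather than hand-rolling a dispatcher like this one.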