lichangh20/qwen3-4b-instruct-sft-swegym-iter1
The Qwen3-4B-Instruct-2507 model by Qwen is a 4.0-billion-parameter causal language model with a native context length of 262,144 tokens. This instruction-tuned model, an update to the Qwen3-4B non-thinking mode, delivers significant improvements in instruction following, logical reasoning, mathematics, coding, and long-tail knowledge coverage across multiple languages. It excels at subjective and open-ended tasks, producing more helpful responses and higher-quality text, and is designed to operate without emitting internal 'thinking' blocks.
Overview
Qwen3-4B-Instruct-2507 is an updated 4.0 billion parameter instruction-tuned causal language model from Qwen, featuring a substantial native context length of 262,144 tokens. This iteration, building on the Qwen3-4B non-thinking mode, focuses on direct instruction following without generating internal thought processes. It has undergone significant enhancements across various domains, including general capabilities, long-tail knowledge coverage, and user alignment for subjective tasks.
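As a rough starting point, the snippet below shows one way to load the model with Hugging Face transformers and run a single chat turn. This is a minimal sketch, not an official quickstart: the checkpoint ID `Qwen/Qwen3-4B-Instruct-2507` refers to the base model on the Hub, so substitute this repository's ID to use the fine-tuned weights, and adjust the dtype and device settings to your hardware.

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

# Base checkpoint ID on the Hub; swap in this repo's ID for the fine-tuned weights.
model_name = "Qwen/Qwen3-4B-Instruct-2507"

tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(
    model_name,
    torch_dtype="auto",   # use the checkpoint's native precision
    device_map="auto",    # spread layers across available devices
)

messages = [{"role": "user", "content": "Explain the difference between a list and a tuple in Python."}]
text = tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
inputs = tokenizer([text], return_tensors="pt").to(model.device)

# No 'thinking' blocks are expected: the model answers directly.
output_ids = model.generate(**inputs, max_new_tokens=512)
response = tokenizer.decode(output_ids[0][inputs.input_ids.shape[1]:], skip_special_tokens=True)
print(response)
```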
Key Capabilities
- General Capabilities: Demonstrates significant improvements in instruction following, logical reasoning, text comprehension, mathematics, science, coding, and tool usage.
- Long-Context Understanding: Features enhanced capabilities in processing and understanding long contexts up to 256K tokens.
- Multilingualism: Shows substantial gains in long-tail knowledge coverage across multiple languages.
- Subjective Task Alignment: Markedly better alignment with user preferences in subjective and open-ended tasks, leading to more helpful and higher-quality text generation.
- Agentic Use: Excels at tool calling, with Qwen-Agent recommended for optimal performance; a minimal usage sketch follows this list.
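The sketch below shows one way to wire the model into Qwen-Agent for tool calling. It assumes the model is already served behind an OpenAI-compatible endpoint at `http://localhost:8000/v1` (for example via vLLM or SGLang); the endpoint URL and the `code_interpreter` tool choice are illustrative, not prescriptive.

```python
from qwen_agent.agents import Assistant

# Assumed local OpenAI-compatible endpoint serving the model (e.g., vLLM).
llm_cfg = {
    "model": "Qwen3-4B-Instruct-2507",
    "model_server": "http://localhost:8000/v1",
    "api_key": "EMPTY",
}

# The built-in tool chosen here is illustrative; Qwen-Agent also accepts custom tools.
bot = Assistant(llm=llm_cfg, function_list=["code_interpreter"])

messages = [{"role": "user", "content": "Compute the 20th Fibonacci number with code."}]
responses = []
for responses in bot.run(messages=messages):
    pass  # bot.run streams incremental response batches; keep the final one
print(responses[-1]["content"])
```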
Performance Highlights
The model performs strongly across benchmarks, often outperforming its predecessor, Qwen3-4B Non-Thinking, and in some cases the larger Qwen3-30B-A3B Non-Thinking as well as GPT-4.1-nano-2025-04-14. Notable results include:
- Knowledge: Achieves 69.6 on MMLU-Pro and 84.2 on MMLU-Redux.
- Reasoning: Scores 47.4 on AIME25 and 80.2 on ZebraLogic.
- Coding: Reaches 35.1 on LiveCodeBench v6 and 76.8 on MultiPL-E.
- Alignment: Scores 83.5 on Creative Writing v3 and 83.4 on WritingBench.
Good For
- Applications requiring strong instruction following and logical reasoning.
- Tasks benefiting from extensive context understanding (up to 262,144 tokens; see the serving sketch after this list).
- Multilingual applications and tasks requiring broad knowledge coverage.
- Subjective and open-ended text generation where user preference alignment is crucial.
- Agentic workflows and tool-calling scenarios.
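To actually exercise the 262,144-token window in the long-context scenarios above, the model typically needs to be served with an explicit maximum length. Below is a hedged sketch using vLLM's offline Python API; note that `max_model_len=262144` reserves KV cache for the full window and may exceed a single GPU's memory, so reduce it to fit your hardware. The sampling values shown are illustrative defaults, not tuned recommendations.

```python
from vllm import LLM, SamplingParams

# Full 262,144-token window; lower max_model_len if the KV cache does not fit in GPU memory.
llm = LLM(model="Qwen/Qwen3-4B-Instruct-2507", max_model_len=262144)

params = SamplingParams(temperature=0.7, top_p=0.8, max_tokens=1024)
messages = [{"role": "user", "content": "Summarize the key findings of the report below.\n\n<long document here>"}]

outputs = llm.chat(messages, params)
print(outputs[0].outputs[0].text)
```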