lichangh20/qwen3-4b-instruct-sft-swegym-iter2
The Qwen3-4B-Instruct-2507 model by Qwen is a 4.0 billion parameter instruction-tuned causal language model, part of the Qwen3 series. It features a native context length of 262,144 tokens and is specifically designed for non-thinking mode operations, excelling in instruction following, logical reasoning, and long-tail knowledge coverage across multiple languages. This model demonstrates significant improvements in general capabilities, including mathematics, science, coding, and tool usage, making it suitable for a wide range of complex generative AI applications.
Overview
Qwen3-4B-Instruct-2507 is an updated 4.0 billion parameter instruction-tuned causal language model from the Qwen3 series, designed for "non-thinking mode" operations. This iteration, building on the Qwen3-4B foundation, focuses on enhanced general capabilities and user alignment. It features a substantial native context length of 262,144 tokens, making it adept at processing and understanding extensive inputs.
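A minimal inference sketch with the `transformers` library (plus `accelerate` for automatic device placement) might look like the following. The Hub ID below refers to the base model this card describes and is an assumption; substitute the fine-tuned checkpoint as needed.

```python
# Minimal inference sketch, assuming `transformers` (and `accelerate` for
# device_map="auto") are installed. MODEL_ID is an assumed Hub checkpoint name.
MODEL_ID = "Qwen/Qwen3-4B-Instruct-2507"

def build_messages(prompt: str) -> list[dict]:
    """Wrap a user prompt in the chat format the tokenizer's template expects."""
    return [{"role": "user", "content": prompt}]

def generate(prompt: str, max_new_tokens: int = 512) -> str:
    # Imported lazily so the message helper stays usable without transformers.
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
    model = AutoModelForCausalLM.from_pretrained(
        MODEL_ID, torch_dtype="auto", device_map="auto"
    )
    text = tokenizer.apply_chat_template(
        build_messages(prompt), tokenize=False, add_generation_prompt=True
    )
    inputs = tokenizer([text], return_tensors="pt").to(model.device)
    out = model.generate(**inputs, max_new_tokens=max_new_tokens)
    # Decode only the newly generated tokens, not the echoed prompt.
    return tokenizer.decode(out[0][inputs.input_ids.shape[1]:], skip_special_tokens=True)

# Example (downloads several GB of weights on first call):
#   print(generate("Summarize the Qwen3 series in two sentences."))
```

Because the model runs in non-thinking mode only, no `enable_thinking` toggling is needed; the chat template is applied directly.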
Key Capabilities
- General Instruction Following: Significant improvements across instruction following, logical reasoning, and text comprehension.
- Multilingual Knowledge: Substantial gains in long-tail knowledge coverage across various languages.
- Mathematical & Scientific Reasoning: Enhanced performance in mathematics and science tasks.
- Coding & Tool Usage: Improved capabilities in code generation and effective tool utilization.
- Long-Context Understanding: Markedly better performance in understanding and processing long contexts up to 256K tokens.
- User Alignment: Better alignment with user preferences for subjective and open-ended tasks, leading to more helpful and higher-quality text generation.
Performance Highlights
The model shows strong performance across various benchmarks, often outperforming its predecessor, Qwen3-4B Non-Thinking, and, in some cases, larger models. Notable improvements include:
- Knowledge: Achieves 69.6 on MMLU-Pro and 84.2 on MMLU-Redux.
- Reasoning: Scores 47.4 on AIME25 and 80.2 on ZebraLogic.
- Coding: Reaches 35.1 on LiveCodeBench v6 and 76.8 on MultiPL-E.
- Alignment: Scores 83.4 on IFEval and 83.5 on Creative Writing v3.
- Agentic Use: Excels in tool calling, with strong results on BFCL-v3 (61.9) and TAU1-Retail (48.7).
Good For
- Applications requiring robust instruction following and logical reasoning.
- Tasks benefiting from extensive long-context understanding.
- Multilingual content generation and knowledge retrieval.
- Coding assistance and tool-use scenarios, especially with Qwen-Agent integration.
- Generating high-quality, aligned responses for subjective and open-ended prompts.
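For the tool-use scenarios above, one possible flow is to pass an OpenAI-style function schema to the chat template and route the model's tool calls through a dispatcher. This is a sketch, not the Qwen-Agent API: the `get_temperature` tool and its stubbed return value are invented for illustration, while the `tools=` argument shown in the comments is the standard `transformers` `apply_chat_template` parameter.

```python
# Hedged sketch of a tool-calling loop. The tool schema below follows the
# OpenAI-style function format that transformers chat templates accept via
# `apply_chat_template(..., tools=TOOLS)`. The weather tool is hypothetical.
import json

TOOLS = [{
    "type": "function",
    "function": {
        "name": "get_temperature",  # hypothetical example tool
        "description": "Return the current temperature for a city.",
        "parameters": {
            "type": "object",
            "properties": {"city": {"type": "string"}},
            "required": ["city"],
        },
    },
}]

def dispatch_tool_call(call: dict) -> str:
    """Execute a tool call parsed from the model's output and return the
    result as a string to feed back in a role="tool" message."""
    args = call["arguments"]
    if isinstance(args, str):  # arguments may arrive as a JSON string
        args = json.loads(args)
    if call["name"] == "get_temperature":
        return json.dumps({"city": args["city"], "temperature_c": 21})  # stub value
    raise ValueError(f"unknown tool: {call['name']}")

# In a full loop you would render the prompt with the schema, e.g.:
#   text = tokenizer.apply_chat_template(messages, tools=TOOLS,
#                                        add_generation_prompt=True, tokenize=False)
# then parse any tool call the model emits, append the dispatcher's result as
# {"role": "tool", "content": ...}, and generate again for the final answer.
```

Qwen-Agent, mentioned above, wraps this parse-dispatch-regenerate cycle for you; the sketch shows what that cycle does underneath.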