zhezi12138/Qwen3-4B_RL
Qwen3-4B-Instruct-2507: Enhanced 4B Causal Language Model
Qwen3-4B-Instruct-2507 is an updated 4 billion parameter causal language model from Qwen, building upon the Qwen3-4B non-thinking mode. It features a substantial native context length of 262,144 tokens, making it highly capable for processing extensive inputs. This model is specifically designed to operate in a "non-thinking" mode, meaning it does not generate <think></think> blocks in its output, simplifying its use.
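Since the model responds directly without `<think></think>` blocks, conversations are serialized with the standard ChatML-style template used by Qwen chat models. The sketch below illustrates that format; the helper name `build_prompt` is illustrative, not an official API, and in practice the tokenizer's own chat template should be used.

```python
# Minimal sketch of the ChatML-style prompt format used by Qwen chat models.
# The special tokens follow the standard Qwen template; the helper itself
# is illustrative (use tokenizer.apply_chat_template in real code).

def build_prompt(messages):
    """Serialize a list of {role, content} dicts into a ChatML prompt."""
    parts = []
    for m in messages:
        parts.append(f"<|im_start|>{m['role']}\n{m['content']}<|im_end|>\n")
    # Leave an open assistant turn for the model to complete.
    parts.append("<|im_start|>assistant\n")
    return "".join(parts)

prompt = build_prompt([
    {"role": "system", "content": "You are a helpful assistant."},
    {"role": "user", "content": "Give me a short introduction to LLMs."},
])
print(prompt)
```

Because there is no thinking mode, the model begins its answer immediately after the open `assistant` turn.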
Key Capabilities and Enhancements
- General Capabilities: Demonstrates significant improvements across instruction following, logical reasoning, text comprehension, mathematics, science, and coding.
- Long-Tail Knowledge: Achieves substantial gains in knowledge coverage across multiple languages.
- User Alignment: Markedly better alignment with user preferences for subjective and open-ended tasks, leading to more helpful responses and higher-quality text generation.
- Long-Context Understanding: Enhanced capabilities in understanding and processing information within its 256K long context window.
- Tool Usage: Excels in tool calling; Qwen recommends pairing the model with Qwen-Agent to get the best agentic performance.
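To make the tool-calling capability concrete, here is a hedged sketch of the loop that a framework like Qwen-Agent automates: the application declares tools in a JSON schema, the model emits a structured call, and the application dispatches it. The OpenAI-style schema shape, the `get_weather` tool, and the dispatcher are illustrative assumptions, not part of the model card.

```python
import json

# Sketch of a tool-calling round trip. The schema follows the common
# OpenAI-style function format; the tool name and dispatcher are
# illustrative assumptions (Qwen-Agent handles this plumbing for you).

TOOLS = [{
    "type": "function",
    "function": {
        "name": "get_weather",
        "description": "Get the current weather for a city.",
        "parameters": {
            "type": "object",
            "properties": {"city": {"type": "string"}},
            "required": ["city"],
        },
    },
}]

def get_weather(city):
    # Stand-in implementation for the sketch.
    return {"city": city, "forecast": "sunny"}

def dispatch(tool_call_json):
    """Parse a model-emitted tool call and run the matching function."""
    call = json.loads(tool_call_json)
    fn = {"get_weather": get_weather}[call["name"]]
    return fn(**call["arguments"])

# Simulate a tool call as the model might emit it.
result = dispatch('{"name": "get_weather", "arguments": {"city": "Beijing"}}')
print(result)
```

The tool result would then be appended to the conversation as a `tool` message so the model can compose its final answer.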
Performance Highlights
In benchmarks against comparable models, Qwen3-4B-Instruct-2507 performs strongly:
- Knowledge: Achieves 69.6 on MMLU-Pro and 62.0 on GPQA, outperforming Qwen3-4B Non-Thinking and GPT-4.1-nano-2025-04-14.
- Reasoning: Scores 47.4 on AIME25 and 80.2 on ZebraLogic, indicating significant improvements.
- Coding: Reaches 35.1 on LiveCodeBench v6 and 76.8 on MultiPL-E.
- Alignment: Scores 83.4 on IFEval and 83.5 on Creative Writing v3, showing strong user preference alignment.
Recommended Use Cases
This model is well-suited for applications requiring:
- Advanced instruction following and complex reasoning.
- High-quality text generation in open-ended scenarios.
- Processing and understanding very long documents or conversations.
- Multilingual applications and tasks requiring broad knowledge.
- Agentic workflows and tool-use integration.
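For long-document use cases, it helps to check an input against the 262,144-token native window before sending it. The sketch below uses a crude characters-per-token heuristic; the ratio is an assumption that varies by language, so the model's actual tokenizer should be used for exact counts.

```python
# Rough pre-flight check against the 262,144-token native context window.
# CHARS_PER_TOKEN is a crude heuristic, not the model's real tokenizer;
# use the actual tokenizer for exact counts.

CONTEXT_WINDOW = 262_144
CHARS_PER_TOKEN = 4  # assumption; varies by language and content

def fits_in_context(text, reserve_for_output=2_048):
    """Estimate whether `text` plus an output budget fits the window."""
    est_tokens = len(text) / CHARS_PER_TOKEN
    return est_tokens + reserve_for_output <= CONTEXT_WINDOW

print(fits_in_context("hello " * 1000))   # short input
print(fits_in_context("x" * 2_000_000))   # ~500K estimated tokens
```

Reserving headroom for the generated output avoids truncating the model's answer when the input nearly fills the window.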