zhezi12138/Qwen3-4B_RL

Task: Text Generation · Concurrency Cost: 1 · Model Size: 4B · Quant: BF16 · Ctx Length: 32k · Published: Apr 27, 2026 · License: apache-2.0 · Architecture: Transformer · Open Weights

Qwen3-4B-Instruct-2507 is a 4-billion-parameter causal language model developed by Qwen, featuring a native context length of 262,144 tokens. An updated release of the Qwen3-4B non-thinking mode, it demonstrates significant improvements in instruction following, logical reasoning, mathematics, coding, and long-tail knowledge across multiple languages. It is specifically tuned for closer alignment with user preferences in subjective and open-ended tasks, making it well suited to generating helpful, high-quality text.


Qwen3-4B-Instruct-2507: Enhanced 4B Causal Language Model

Qwen3-4B-Instruct-2507 is an updated 4-billion-parameter causal language model from Qwen, building on the Qwen3-4B non-thinking mode. It features a substantial native context length of 262,144 tokens, making it well suited to processing very long inputs. The model operates exclusively in "non-thinking" mode, meaning it does not emit <think></think> blocks in its output, which simplifies downstream use.
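
As a minimal sketch of basic inference, here is a standard Hugging Face `transformers` chat-template workflow. The checkpoint id below is an assumption (the upstream Qwen base model); substitute the weights you are actually using.

```python
# Minimal inference sketch with Hugging Face transformers.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "Qwen/Qwen3-4B-Instruct-2507"  # assumption: upstream base checkpoint

tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(
    model_name,
    torch_dtype="auto",   # BF16 weights load natively on supported hardware
    device_map="auto",
)

messages = [{"role": "user", "content": "Give me a short introduction to large language models."}]

# apply_chat_template inserts the model's chat markup; no <think></think>
# blocks appear in the output because this checkpoint runs in non-thinking mode.
inputs = tokenizer.apply_chat_template(
    messages,
    add_generation_prompt=True,
    return_tensors="pt",
).to(model.device)

outputs = model.generate(inputs, max_new_tokens=512)
print(tokenizer.decode(outputs[0][inputs.shape[-1]:], skip_special_tokens=True))
```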

Key Capabilities and Enhancements

  • General Capabilities: Demonstrates significant improvements across instruction following, logical reasoning, text comprehension, mathematics, science, and coding.
  • Long-Tail Knowledge: Achieves substantial gains in knowledge coverage across multiple languages.
  • User Alignment: Markedly better alignment with user preferences for subjective and open-ended tasks, leading to more helpful responses and higher-quality text generation.
  • Long-Context Understanding: Enhanced ability to understand and process information across its 256K-token context window.
  • Tool Usage: Excels at tool calling; the Qwen team recommends Qwen-Agent for the best agentic performance (see the sketch after this list).
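
The sketch below shows tool calling via Qwen-Agent, based on its documented `Assistant` interface. The model name and server URL are assumptions for a locally hosted OpenAI-compatible endpoint; adjust them to your deployment.

```python
# Hedged sketch: tool calling through Qwen-Agent's Assistant class.
from qwen_agent.agents import Assistant

llm_cfg = {
    "model": "Qwen3-4B-Instruct-2507",           # assumed deployed model name
    "model_server": "http://localhost:8000/v1",  # assumed OpenAI-compatible endpoint
    "api_key": "EMPTY",
}

# 'code_interpreter' is one of Qwen-Agent's built-in tools.
bot = Assistant(llm=llm_cfg, function_list=["code_interpreter"])

messages = [{"role": "user", "content": "Compute the 20th Fibonacci number."}]
responses = []
for responses in bot.run(messages=messages):
    pass  # bot.run streams incremental lists of response messages
print(responses[-1]["content"])
```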

Performance Highlights

In benchmarks against comparable models, Qwen3-4B-Instruct-2507 performs strongly:

  • Knowledge: Achieves 69.6 on MMLU-Pro and 62.0 on GPQA, outperforming Qwen3-4B Non-Thinking and GPT-4.1-nano-2025-04-14.
  • Reasoning: Scores 47.4 on AIME25 and 80.2 on ZebraLogic, a marked improvement over the earlier Qwen3-4B non-thinking release.
  • Coding: Reaches 35.1 on LiveCodeBench v6 and 76.8 on MultiPL-E.
  • Alignment: Scores 83.4 on IFEval and 83.5 on Creative Writing v3, showing strong user preference alignment.

Recommended Use Cases

This model is well-suited for applications requiring:

  • Advanced instruction following and complex reasoning.
  • High-quality text generation in open-ended scenarios.
  • Processing and understanding very long documents or conversations (see the example after this list).
  • Multilingual applications and tasks requiring broad knowledge.
  • Agentic workflows and tool-use integration.
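
As a sketch of the long-document scenario above, the snippet below queries the model over an OpenAI-compatible API, assuming a server such as vLLM is hosting the checkpoint. The base URL, served model name, and input file path are placeholders.

```python
# Sketch: long-document summarization via an OpenAI-compatible endpoint.
from openai import OpenAI

client = OpenAI(base_url="http://localhost:8000/v1", api_key="EMPTY")

with open("long_report.txt") as f:  # hypothetical long input document
    document = f.read()

completion = client.chat.completions.create(
    model="Qwen3-4B-Instruct-2507",  # assumed served model name
    messages=[
        {"role": "user",
         "content": f"Summarize the key findings of this report:\n\n{document}"},
    ],
    max_tokens=1024,
)
print(completion.choices[0].message.content)
```

The long native context means an entire report can be passed in one request rather than chunked, provided the serving configuration exposes the full context length.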