oceanicity/Qwen3-4B-Instruct-2507

Source: Hugging Face

Text Generation · Model size: 4B · Quant: BF16 · Context length: 32k · Concurrency cost: 1 · Published: Mar 26, 2026 · License: apache-2.0 · Architecture: Transformer · Open weights

Qwen3-4B-Instruct-2507 is a 4 billion parameter causal language model developed by Qwen, featuring significant improvements in general capabilities including instruction following, logical reasoning, mathematics, coding, and tool usage. This updated version excels in long-tail knowledge coverage across multiple languages and offers enhanced alignment with user preferences for subjective and open-ended tasks. It supports an impressive native context length of 262,144 tokens, making it suitable for complex tasks requiring extensive context understanding.


Qwen3-4B-Instruct-2507: Enhanced Instruction-Following LLM

Qwen3-4B-Instruct-2507 is an updated 4-billion-parameter causal language model from Qwen, building on the non-thinking mode of Qwen3-4B. This model focuses on delivering significant improvements across a broad range of general capabilities.

Key Capabilities and Enhancements

  • General Instruction Following: Demonstrates substantial gains in understanding and executing instructions.
  • Logical Reasoning & Comprehension: Enhanced abilities in logical reasoning, mathematics, and text comprehension.
  • Coding & Tool Usage: Improved performance in coding tasks and effective tool utilization.
  • Long-Tail Knowledge: Offers better coverage of less common knowledge across various languages.
  • User Alignment: Markedly better alignment with user preferences for subjective and open-ended tasks, leading to more helpful and higher-quality text generation.
  • Extended Context: Supports a native context length of 262,144 tokens, enabling deep understanding of very long inputs.
  • Non-Thinking Mode: Operates exclusively in non-thinking mode, so its output contains no <think></think> blocks and needs no extra parsing (see the usage sketch after this list).
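
Because the model operates in non-thinking mode only, generations can be consumed as-is, with no <think></think> block to strip. A minimal usage sketch with the Hugging Face transformers library follows; the repo id `Qwen/Qwen3-4B-Instruct-2507` is the upstream name and is an assumption here, not taken from this page.

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "Qwen/Qwen3-4B-Instruct-2507"  # upstream repo id (assumed)

tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(
    model_name,
    torch_dtype="auto",  # picks up the published BF16 weights
    device_map="auto",
)

messages = [{"role": "user", "content": "Give me a short introduction to large language models."}]
text = tokenizer.apply_chat_template(
    messages,
    tokenize=False,
    add_generation_prompt=True,
)
inputs = tokenizer([text], return_tensors="pt").to(model.device)

output_ids = model.generate(**inputs, max_new_tokens=512)
# Decode only the newly generated tokens; no <think></think> wrapper appears.
new_tokens = output_ids[0][len(inputs.input_ids[0]):]
print(tokenizer.decode(new_tokens, skip_special_tokens=True))
```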

Performance Highlights

The model shows strong performance across various benchmarks, often outperforming its predecessor and other models in its class. Notable improvements are seen in:

  • Knowledge: Achieves 69.6 on MMLU-Pro and 84.2 on MMLU-Redux.
  • Reasoning: Scores 47.4 on AIME25 and 80.2 on ZebraLogic.
  • Coding: Reaches 76.8 on MultiPL-E.
  • Alignment: Excels in Creative Writing v3 with 83.5 and WritingBench with 83.4.

Recommended Use Cases

This model is particularly well-suited for applications requiring:

  • Complex Instruction Following: Where precise adherence to user commands is critical.
  • Long Document Analysis: Leveraging its 262K context window for summarizing, querying, or generating content from extensive texts (a client sketch follows this list).
  • Multilingual Applications: Benefiting from its enhanced long-tail knowledge coverage across languages.
  • Agentic Workflows: Strong tool-calling performance; the Qwen-Agent framework is recommended for streamlined integration (see the sketch below).
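
For the long-document case, one common pattern is to serve the model behind an OpenAI-compatible endpoint and send long inputs through the standard chat API. The sketch below assumes such a local endpoint (for example, one started with vLLM at its full context length) and a hypothetical input file; none of these specifics come from this page.

```python
# Minimal sketch: summarize a long document via an OpenAI-compatible endpoint.
# Assumes a local server (e.g. vLLM) exposing the model's full context window.
from openai import OpenAI

client = OpenAI(base_url="http://localhost:8000/v1", api_key="EMPTY")

with open("report.txt") as f:  # hypothetical long input, up to the 262K window
    document = f.read()

resp = client.chat.completions.create(
    model="Qwen/Qwen3-4B-Instruct-2507",  # must match the served model name
    messages=[
        {"role": "user", "content": f"Summarize the key findings of this report:\n\n{document}"},
    ],
    max_tokens=512,
)
print(resp.choices[0].message.content)
```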
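
For agentic workflows, Qwen-Agent wraps the model's tool-calling templates and parsers so function calls need not be handled by hand. A minimal sketch, assuming the same local OpenAI-compatible endpoint as above and the qwen-agent package:

```python
# Minimal sketch: tool calling through Qwen-Agent's built-in code interpreter.
from qwen_agent.agents import Assistant

llm_cfg = {
    "model": "Qwen3-4B-Instruct-2507",           # served model name (assumed)
    "model_server": "http://localhost:8000/v1",  # OpenAI-compatible API base
    "api_key": "EMPTY",
}

# 'code_interpreter' is one of Qwen-Agent's built-in tools; the agent decides
# when to invoke it based on the conversation.
bot = Assistant(llm=llm_cfg, function_list=["code_interpreter"])

messages = [{"role": "user", "content": "Use code to compute the 20th Fibonacci number."}]

# bot.run streams progressively longer response lists; keep the final state.
responses = []
for responses in bot.run(messages=messages):
    pass
print(responses)
```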