xuefengli/qwen3_4b_openthoughts_400k

Text Generation | Concurrency Cost: 1 | Model Size: 4B | Quant: BF16 | Ctx Length: 32k | Published: Dec 11, 2025 | License: apache-2.0 | Architecture: Transformer | Open Weights | Cold

Qwen3-4B is a 4-billion-parameter causal language model from Qwen, part of the Qwen3 series. It supports switching between a 'thinking mode' for complex reasoning, math, and coding, and a 'non-thinking mode' for efficient general dialogue. The model offers enhanced reasoning capabilities, strong human preference alignment for creative writing and role-playing, and agent capabilities with multilingual support for over 100 languages.

Qwen3-4B: Adaptive Reasoning and Multilingual LLM

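Qwen3-4B is a 4-billion-parameter causal language model developed by Qwen, designed for advanced reasoning, instruction following, and agent capabilities. Its key differentiator is the ability to switch between two operational modes: a 'thinking mode' for complex logical reasoning, mathematics, and code generation, and a 'non-thinking mode' for efficient, general-purpose dialogue.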
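To make the two modes concrete, here is a minimal sketch of how a client might separate the reasoning trace from the final answer in generated text. It assumes the convention used by Qwen3 models of wrapping thinking-mode reasoning in `<think>...</think>` tags; the function name `split_thinking` is hypothetical and not part of any library, and how the mode is selected at request time depends on the serving stack and is not shown here.

```python
from typing import Tuple

# Assumed tag convention for thinking-mode output.
THINK_OPEN = "<think>"
THINK_CLOSE = "</think>"


def split_thinking(text: str) -> Tuple[str, str]:
    """Split raw generated text into (reasoning, answer).

    Hypothetical helper: if no closing tag is found, the text is
    treated as non-thinking output and the reasoning part is empty.
    """
    close_idx = text.rfind(THINK_CLOSE)
    if close_idx == -1:
        return "", text.strip()

    reasoning = text[:close_idx]
    # The opening tag may be missing if it was already part of the prompt.
    open_idx = reasoning.find(THINK_OPEN)
    if open_idx != -1:
        reasoning = reasoning[open_idx + len(THINK_OPEN):]

    answer = text[close_idx + len(THINK_CLOSE):]
    return reasoning.strip(), answer.strip()


if __name__ == "__main__":
    raw = "<think>2 + 2 equals 4.</think>\n\nThe answer is 4."
    reasoning, answer = split_thinking(raw)
    print("reasoning:", reasoning)
    print("answer:", answer)
```

Keeping the reasoning and the answer as separate strings lets an application log or hide the trace while showing only the answer to the user.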

Key Capabilities

  • Adaptive Thinking Modes: Features a unique 'thinking mode' for complex logical reasoning, mathematics, and code generation, and a 'non-thinking mode' for efficient, general-purpose dialogue. This allows for optimized performance across diverse tasks.
  • Enhanced Reasoning: Demonstrates significant improvements in mathematical problem-solving, code generation, and commonsense logical reasoning compared to previous Qwen models.
  • Human Preference Alignment: Excels in creative writing, role-playing, and multi-turn dialogues, providing a more natural and engaging conversational experience.
  • Agentic Expertise: Offers strong tool-calling capabilities, achieving leading performance among open-source models in complex agent-based tasks, especially when integrated with Qwen-Agent.
  • Multilingual Support: Supports over 100 languages and dialects with robust multilingual instruction following and translation abilities.
  • Extended Context: Natively supports a context length of 32,768 tokens, extendable up to 131,072 tokens using YaRN scaling techniques for processing long texts.
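The context-extension arithmetic in the last item can be sketched as follows. This is an illustration under stated assumptions, not the model's official configuration: the helper `yarn_rope_scaling` is hypothetical, and the dictionary keys (`rope_type`, `factor`, `original_max_position_embeddings`) follow the RoPE-scaling convention commonly used in Hugging Face style model configs, so they should be checked against the serving framework in use.

```python
from typing import Optional

NATIVE_CONTEXT = 32_768     # native context length stated above
EXTENDED_CONTEXT = 131_072  # maximum extended length stated above


def yarn_rope_scaling(target_len: int,
                      native_len: int = NATIVE_CONTEXT) -> Optional[dict]:
    """Build a YaRN scaling entry for a desired context length.

    Hypothetical helper. Returns None when the target fits in the
    native window, since no scaling is needed in that case.
    """
    if target_len <= 0:
        raise ValueError("target_len must be positive")
    if target_len > EXTENDED_CONTEXT:
        raise ValueError("target_len exceeds the stated 131,072-token limit")
    if target_len <= native_len:
        return None

    # The scaling factor is the ratio of target to native length.
    factor = target_len / native_len
    return {
        "rope_type": "yarn",  # assumed key names, see lead-in
        "factor": factor,
        "original_max_position_embeddings": native_len,
    }


if __name__ == "__main__":
    print(yarn_rope_scaling(EXTENDED_CONTEXT))
```

At the full 131,072-token target the factor is 131,072 / 32,768 = 4.0; a smaller target such as 65,536 tokens gives a factor of 2.0.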

Good For

  • Applications requiring dynamic switching between analytical and conversational tasks.
  • Complex problem-solving in mathematics and programming.
  • Creative content generation and interactive role-playing scenarios.
  • Developing AI agents that integrate with external tools.
  • Multilingual applications needing strong instruction following and translation.