dnotitia/Qwen3-4B

  • Source: Hugging Face
  • Task: Text generation
  • Model size: 4B
  • Quantization: BF16
  • Context length: 32k
  • Published: Sep 28, 2025
  • License: apache-2.0
  • Architecture: Transformer (open weights)
  • Concurrency cost: 1

dnotitia/Qwen3-4B is a 4.0-billion-parameter causal language model from the Qwen3 series, developed by the Qwen team, featuring a dual-mode architecture that switches between a 'thinking' mode (complex reasoning, math, coding) and a 'non-thinking' mode (efficient dialogue). It offers improved reasoning, strong human preference alignment, and capable agent behavior, with support for over 100 languages. This version includes Dnotitia's patches for improved training compatibility, such as a refactored chat template and TRL library support.

Qwen3-4B: A Dual-Mode Language Model

dnotitia/Qwen3-4B is a 4.0-billion-parameter model from the Qwen3 series, developed by the Qwen team, with patches by Dnotitia for enhanced training compatibility. The model can switch between a 'thinking mode' for complex logical reasoning, mathematics, and code generation, and a 'non-thinking mode' for efficient, general-purpose dialogue. It significantly improves reasoning over previous Qwen models and shows stronger human preference alignment for creative writing, role-playing, and multi-turn conversation.
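
As a minimal sketch of the mode switch, assuming the standard Hugging Face transformers API and the Qwen3 chat-template convention (the prompt below is illustrative):

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "dnotitia/Qwen3-4B"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(
    model_name, torch_dtype="auto", device_map="auto"
)

messages = [{"role": "user", "content": "How many primes are there below 100?"}]

# enable_thinking=True lets the model emit a <think>...</think> reasoning block
# before its answer; set it to False for fast, general-purpose dialogue.
prompt = tokenizer.apply_chat_template(
    messages,
    tokenize=False,
    add_generation_prompt=True,
    enable_thinking=True,
)

inputs = tokenizer([prompt], return_tensors="pt").to(model.device)
output_ids = model.generate(**inputs, max_new_tokens=1024)
print(tokenizer.decode(output_ids[0][inputs.input_ids.shape[-1]:], skip_special_tokens=True))
```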

Key Capabilities

  • Dual-Mode Operation: Dynamically switches between reasoning-intensive and general dialogue modes, optimizing performance across diverse tasks.
  • Enhanced Reasoning: Excels in mathematical problems, code generation, and commonsense logical reasoning.
  • Superior Alignment: Offers a more natural and engaging conversational experience through improved human preference alignment.
  • Agentic Expertise: Integrates with external tools and achieves leading performance among open-source models on complex agent tasks (a tool-calling sketch follows this list).
  • Multilingual Support: Supports over 100 languages and dialects with strong instruction following and translation abilities.
  • Extended Context: Natively handles up to 32,768 tokens, extendable to 131,072 tokens with YaRN scaling (a configuration sketch also follows this list).
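
As a hedged sketch of the tool integration above: the transformers chat-template API accepts a tools argument that Qwen3's template renders into the prompt. The get_weather schema below is a hypothetical placeholder, and parsing or executing the resulting tool call is left to the caller:

```python
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("dnotitia/Qwen3-4B")

# Hypothetical tool schema; the name, description, and parameters are placeholders.
tools = [{
    "type": "function",
    "function": {
        "name": "get_weather",
        "description": "Get the current weather for a city.",
        "parameters": {
            "type": "object",
            "properties": {"city": {"type": "string"}},
            "required": ["city"],
        },
    },
}]

messages = [{"role": "user", "content": "What's the weather in Seoul?"}]

# The template injects the tool schemas into the prompt; the model then emits
# a structured tool call that the application must parse and execute.
prompt = tokenizer.apply_chat_template(
    messages,
    tools=tools,
    tokenize=False,
    add_generation_prompt=True,
)
```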

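For the extended-context bullet, a minimal sketch of enabling YaRN via a config override, assuming a recent transformers version; the factor of 4.0 follows from 32,768 × 4 = 131,072 tokens:

```python
from transformers import AutoConfig, AutoModelForCausalLM

config = AutoConfig.from_pretrained("dnotitia/Qwen3-4B")
# YaRN rescales the rotary position embeddings; factor 4.0 stretches the
# native 32,768-token window to roughly 131,072 tokens.
config.rope_scaling = {
    "rope_type": "yarn",
    "factor": 4.0,
    "original_max_position_embeddings": 32768,
}
model = AutoModelForCausalLM.from_pretrained(
    "dnotitia/Qwen3-4B", config=config, torch_dtype="auto", device_map="auto"
)
```

Because static YaRN applies the same scaling factor to every input, it can slightly degrade quality on short sequences, so it is best enabled only when long contexts are actually needed.
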
Good for

  • Applications requiring flexible reasoning capabilities, from complex problem-solving to casual conversation.
  • Developers looking for a model with strong agent capabilities and tool integration.
  • Multilingual applications needing robust instruction following and translation.
  • Training experiments, especially with the TRL library, thanks to Dnotitia's compatibility patches (see the sketch after this list).
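
A hypothetical fine-tuning sketch with TRL's SFTTrainer, the library the Dnotitia patches target; the dataset, output directory, and default hyperparameters below are placeholders rather than a recommended recipe:

```python
from datasets import load_dataset
from trl import SFTConfig, SFTTrainer

# Placeholder dataset; substitute your own chat-formatted data.
dataset = load_dataset("trl-lib/Capybara", split="train")

trainer = SFTTrainer(
    model="dnotitia/Qwen3-4B",  # TRL loads the model from the Hub by name
    train_dataset=dataset,
    args=SFTConfig(output_dir="qwen3-4b-sft"),
)
trainer.train()
```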