unsloth/Qwen3-4B

  • Visibility: Public
  • Parameters: 4B
  • Precision: BF16
  • Context length: 40,960
  • Released: Apr 28, 2025
  • Hosted on: Hugging Face

Qwen3-4B Overview

Qwen3-4B is a 4-billion-parameter causal language model from the Qwen3 series, designed for advanced reasoning and versatile conversational applications. Its key differentiator is the ability to switch dynamically between a 'thinking mode' for complex tasks such as logical reasoning, mathematics, and code generation, and a 'non-thinking mode' for efficient general dialogue, so a single model covers both workloads. Qwen3-4B shows significant gains in reasoning capability, outperforming earlier Qwen models on reasoning-oriented tasks.

Key Capabilities

  • Dynamic Thinking Modes: Seamlessly switches between a reasoning-focused mode and an efficient general-purpose mode, configurable via enable_thinking parameter or in-prompt /think and /no_think commands.
  • Enhanced Reasoning: Improved performance in mathematics, code generation, and commonsense logical reasoning.
  • Human Preference Alignment: Excels in creative writing, role-playing, and multi-turn dialogues, offering a more natural conversational experience.
  • Agentic Capabilities: Strong tool-calling abilities, enabling precise integration with external tools for complex agent-based tasks.
  • Multilingual Support: Supports over 100 languages and dialects with robust multilingual instruction following and translation.
  • Extended Context Window: Natively handles up to 32,768 tokens, extendable to 131,072 tokens using the YaRN method for long text processing.
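The mode switch described above can be driven either through the chat template's enable_thinking argument or by appending the in-prompt /think and /no_think commands to a user turn. As a minimal sketch of the in-prompt route, the helper below (apply_mode_switch is our own illustrative name, not part of any Qwen or Hugging Face API) tags a message with the appropriate command:

```python
def apply_mode_switch(user_message: str, thinking: bool) -> str:
    """Append Qwen3's in-prompt soft switch to a user turn.

    /think and /no_think are the commands named in the model card;
    this wrapper function itself is a hypothetical convenience.
    """
    switch = "/think" if thinking else "/no_think"
    return f"{user_message} {switch}"


# Build a conversation where each turn explicitly selects a mode.
messages = [
    {"role": "user",
     "content": apply_mode_switch("Prove that sqrt(2) is irrational.", thinking=True)},
    {"role": "user",
     "content": apply_mode_switch("Tell me a joke.", thinking=False)},
]
```

When using the transformers chat template instead, the same effect is typically achieved by passing enable_thinking=True or False to tokenizer.apply_chat_template; the in-prompt commands let you override the mode per turn within a multi-turn dialogue.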

Best Practices

For optimal performance, use mode-specific sampling parameters:

  • Thinking mode: Temperature=0.6, TopP=0.95, TopK=20.
  • Non-thinking mode: Temperature=0.7, TopP=0.8, TopK=20.

The model also benefits from an adequate output budget (up to 32,768 tokens for most queries) and from standardized output formats when benchmarking, particularly for math and multiple-choice questions.