hamishivi/Qwen3-8B

Text Generation · Concurrency Cost: 1 · Model Size: 8B · Quant: FP8 · Ctx Length: 32k · Published: May 14, 2026 · License: apache-2.0 · Architecture: Transformer · Open Weights · Cold

Qwen3-8B is an 8.2 billion parameter causal language model from the Qwen team's Qwen3 series. It supports seamless switching between a "thinking mode" for complex reasoning tasks such as math, coding, and logical problems, and a "non-thinking mode" for general-purpose dialogue, delivering strong performance across diverse scenarios. It offers enhanced reasoning capabilities, improved human preference alignment for creative writing and multi-turn dialogue, and strong agentic abilities, with support for over 100 languages. The model natively supports a 32,768-token context length, extendable to 131,072 tokens using YaRN scaling.


Qwen3-8B: A Versatile Language Model with Adaptive Reasoning

Qwen3-8B is an 8.2 billion parameter causal language model from the Qwen series, designed for advanced reasoning and flexible conversational capabilities. It introduces a unique feature allowing seamless switching between a "thinking mode" for complex logical reasoning, mathematics, and code generation, and a "non-thinking mode" for efficient, general-purpose dialogue. This adaptive approach optimizes performance across various tasks.
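
As a concrete illustration, here is a minimal sketch of loading the model and toggling between the two modes with Hugging Face transformers. It assumes the upstream Qwen/Qwen3-8B checkpoint name and a recent transformers release; the enable_thinking flag is the same switch referenced in the best-practices section below.

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "Qwen/Qwen3-8B"  # assumption: upstream Qwen3-8B checkpoint
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(
    model_name, torch_dtype="auto", device_map="auto"
)

messages = [{"role": "user", "content": "How many primes are there below 100?"}]

# enable_thinking=True renders the chat template with the reasoning block;
# set it to False for direct, non-thinking responses.
text = tokenizer.apply_chat_template(
    messages,
    tokenize=False,
    add_generation_prompt=True,
    enable_thinking=True,
)

inputs = tokenizer([text], return_tensors="pt").to(model.device)
output_ids = model.generate(**inputs, max_new_tokens=1024)
print(tokenizer.decode(output_ids[0][inputs.input_ids.shape[-1]:], skip_special_tokens=True))
```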

Key Capabilities

  • Adaptive Reasoning: Dynamically switches between a detailed reasoning process and direct response generation, enhancing performance in both complex problem-solving and general conversation.
  • Enhanced Performance: Demonstrates significant improvements in reasoning benchmarks, surpassing previous Qwen models in mathematics, code generation, and commonsense logic.
  • Human Preference Alignment: Excels in creative writing, role-playing, and multi-turn dialogues, providing a more natural and engaging user experience.
  • Agentic Abilities: Offers strong tool-calling capabilities, integrating precisely with external tools in both thinking and non-thinking modes, with leading performance among open-source models on agent-based tasks.
  • Multilingual Support: Supports over 100 languages and dialects with robust multilingual instruction following and translation capabilities.
  • Extended Context Window: Natively handles up to 32,768 tokens, with validated support for up to 131,072 tokens using YaRN scaling techniques (see the sketch after this list).
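
The following is a hedged sketch of enabling YaRN via a config override at load time. The rope_scaling fields shown (rope_type, factor, original_max_position_embeddings) follow the commonly documented YaRN configuration for Qwen models, where a factor of 4.0 stretches the native 32,768-token window toward 131,072 tokens; verify the exact keys against the checkpoint's config.json and your transformers version.

```python
from transformers import AutoModelForCausalLM

model = AutoModelForCausalLM.from_pretrained(
    "Qwen/Qwen3-8B",  # assumption: upstream checkpoint name
    torch_dtype="auto",
    device_map="auto",
    # Override the RoPE configuration to enable YaRN scaling.
    rope_scaling={
        "rope_type": "yarn",
        "factor": 4.0,
        "original_max_position_embeddings": 32768,
    },
)
```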

Best Practices for Optimal Use

To achieve the best results, specific sampling parameters are recommended for each mode, as shown in the sketch after this list:

  • Thinking Mode (enable_thinking=True): Use Temperature=0.6, TopP=0.95, TopK=20, and MinP=0. Do not use greedy decoding, which can degrade output quality and cause repetition.
  • Non-Thinking Mode (enable_thinking=False): Use Temperature=0.7, TopP=0.8, TopK=20, and MinP=0.
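
A minimal sketch applying these settings through model.generate, reusing the model, tokenizer, and inputs prepared in the first example; note that min_p requires a reasonably recent transformers release.

```python
# Thinking mode: temperature=0.6, top_p=0.95, top_k=20, min_p=0.
output_ids = model.generate(
    **inputs,
    do_sample=True,  # sampling, not greedy decoding
    temperature=0.6,
    top_p=0.95,
    top_k=20,
    min_p=0.0,
    max_new_tokens=1024,
)

# Non-thinking mode: temperature=0.7, top_p=0.8, top_k=20, min_p=0.
output_ids = model.generate(
    **inputs,
    do_sample=True,
    temperature=0.7,
    top_p=0.8,
    top_k=20,
    min_p=0.0,
    max_new_tokens=1024,
)
```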

For long-text processing, YaRN scaling can be enabled, though it is advised only when necessary, since static scaling can degrade performance on shorter texts. The model also provides soft switches (/think and /no_think) within user prompts for dynamic mode control in multi-turn conversations, as sketched below.
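
A hedged sketch of the soft switches in a multi-turn conversation, assuming the tokenizer from the first example. The switch is plain text appended to the user message, as described above, and the template follows the most recent switch in the conversation.

```python
messages = [
    {"role": "user", "content": "Summarize the plot of Hamlet. /no_think"},
    {"role": "assistant", "content": "Hamlet, prince of Denmark, ..."},
    # A later turn can re-enable the detailed reasoning process:
    {"role": "user", "content": "Now prove that sqrt(2) is irrational. /think"},
]

text = tokenizer.apply_chat_template(
    messages,
    tokenize=False,
    add_generation_prompt=True,
    enable_thinking=True,  # soft switches apply when thinking is enabled
)
```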