Qwen3-8B Overview
Qwen3-8B is an 8.2 billion parameter causal language model from the Qwen series, designed to offer advanced capabilities in reasoning, instruction-following, and multilingual support. A key differentiator is its ability to seamlessly switch between a 'thinking mode' for complex tasks like logical reasoning, mathematics, and code generation, and a 'non-thinking mode' for efficient general-purpose dialogue. This dual-mode functionality ensures optimized performance across diverse scenarios.
Key Capabilities
- Enhanced Reasoning: Significantly improves performance on mathematics, code generation, and commonsense logical reasoning compared to previous Qwen models.
- Human Preference Alignment: Excels in creative writing, role-playing, multi-turn dialogues, and instruction following, providing a more natural conversational experience.
- Agent Capabilities: Demonstrates leading performance among open-source models in complex agent-based tasks, integrating precisely with external tools in both thinking and non-thinking modes.
- Multilingual Support: Supports over 100 languages and dialects with strong multilingual instruction following and translation abilities.
- Extended Context Window: Natively handles 32,768 tokens, extendable up to 131,072 tokens using the YaRN method for processing long texts.
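To extend the context window beyond the native 32,768 tokens, Qwen's published instructions enable YaRN by adding a `rope_scaling` entry to the model's `config.json`. A sketch of that entry, assuming a 4x scaling factor (32,768 × 4 = 131,072 tokens):

```json
{
  "rope_scaling": {
    "rope_type": "yarn",
    "factor": 4.0,
    "original_max_position_embeddings": 32768
  }
}
```

Because the scaling factor is static, it is generally advisable to apply YaRN only when long-context processing is actually required, as it can slightly degrade performance on shorter inputs.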
Best Practices for Usage
The Qwen team recommends mode-specific sampling parameters. In thinking mode, use Temperature=0.6, TopP=0.95, TopK=20, and MinP=0; greedy decoding is discouraged, as it can lead to performance degradation and endless repetitions. In non-thinking mode, use Temperature=0.7, TopP=0.8, TopK=20, and MinP=0. When enable_thinking=True, the mode can also be switched per turn by appending /think or /no_think to the user prompt.
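The recommendations above can be sketched as a small helper that pairs each mode's sampling preset with its soft-switch tag. The parameter values come from the model card; the helper name and structure are illustrative, not part of any official API:

```python
# Sampling presets recommended for each mode (values from the Qwen3-8B model card).
THINKING_PARAMS = {"temperature": 0.6, "top_p": 0.95, "top_k": 20, "min_p": 0.0}
NON_THINKING_PARAMS = {"temperature": 0.7, "top_p": 0.8, "top_k": 20, "min_p": 0.0}


def build_prompt(user_message: str, thinking: bool) -> tuple[str, dict]:
    """Append the /think or /no_think soft switch and return the matching
    sampling preset. Hypothetical helper for illustration only."""
    tag = "/think" if thinking else "/no_think"
    params = THINKING_PARAMS if thinking else NON_THINKING_PARAMS
    return f"{user_message} {tag}", params
```

The returned prompt string and parameter dict would then be passed to whatever generation API is in use (e.g. a chat template plus a generate call); note the soft switches only take effect when enable_thinking=True.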