TwelfthStar/qwen3-8b-nothink-sft

Hugging Face
Text Generation | Concurrency Cost: 1 | Model Size: 8B | Quant: FP8 | Ctx Length: 32k | Published: Mar 31, 2026 | License: apache-2.0 | Architecture: Transformer | Open Weights | Warm

Qwen3-8B is an 8.2 billion parameter causal language model developed by Qwen. It can switch between a 'thinking mode' for complex logical reasoning, math, and coding and a 'non-thinking mode' for efficient general-purpose dialogue. Compared with previous Qwen models, it improves reasoning, instruction following, and agent capabilities, and it supports over 100 languages. It natively supports a context length of 32,768 tokens, extendable to 131,072 tokens with YaRN.

Qwen3-8B: Adaptive Reasoning and Multilingual LLM

Qwen3-8B is an 8.2 billion parameter causal language model from the Qwen series. It can switch between a 'thinking mode' for complex tasks and a 'non-thinking mode' for general dialogue, so one model covers applications ranging from intricate logical reasoning to efficient conversational AI.
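One practical consequence of the two modes is that thinking-mode responses carry reasoning text in addition to the final answer. The sketch below assumes the upstream Qwen3 output convention, in which reasoning is wrapped in `<think>...</think>` ahead of the answer; `split_thinking` is a hypothetical helper name for illustration and is not part of any library.

```python
import re

# Assumption: reasoning appears inside <think>...</think>, as in upstream
# Qwen3 chat output. Non-thinking responses have no block or an empty one.
_THINK_RE = re.compile(r"<think>(.*?)</think>", re.DOTALL)


def split_thinking(text: str) -> tuple[str, str]:
    """Return (reasoning, answer); reasoning is "" when no think block exists."""
    match = _THINK_RE.search(text)
    if match is None:
        return "", text.strip()
    reasoning = match.group(1).strip()
    # Drop the think block and keep everything else as the answer.
    answer = (text[:match.start()] + text[match.end():]).strip()
    return reasoning, answer
```

A caller would typically log or discard the first element and show only the second to the end user.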

Key Capabilities

  • Adaptive Thinking Modes: Switches between a 'thinking mode' for logical reasoning, mathematics, and code generation and a 'non-thinking mode' for efficient, general-purpose interactions.
  • Enhanced Reasoning: Improves on previous Qwen models in mathematical problem-solving, code generation, and commonsense logical reasoning.
  • Human Alignment: Performs well in creative writing, role-playing, multi-turn dialogues, and instruction following, giving a more natural and engaging user experience.
  • Advanced Agentic Functions: Offers robust tool-calling capabilities, achieving leading performance among open-source models in complex agent-based tasks, especially when integrated with Qwen-Agent.
  • Extensive Multilingual Support: Supports over 100 languages and dialects, with strong multilingual instruction following and translation abilities.
  • Long Context Handling: Natively supports a 32,768 token context length, extendable up to 131,072 tokens using the YaRN method for processing long texts.
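The long-context figures above imply a YaRN scaling factor of 131,072 / 32,768 = 4. As a minimal sketch of that arithmetic, the hypothetical helper `yarn_rope_scaling` below builds a `rope_scaling`-style dictionary. The key names (`rope_type`, `factor`, `original_max_position_embeddings`) are assumed to match the convention used in Hugging Face Transformers-style configs and should be checked against the serving framework in use.

```python
NATIVE_CONTEXT = 32_768  # native context length stated in the list above


def yarn_rope_scaling(target_context: int,
                      native_context: int = NATIVE_CONTEXT) -> dict:
    """Build a rope_scaling-style dict for a target context length.

    Assumption: the key names mirror Transformers-style configs; this is
    an illustrative helper, not an official API.
    """
    if target_context <= native_context:
        raise ValueError("YaRN scaling is only needed beyond the native context")
    return {
        "rope_type": "yarn",
        # Ratio of the desired context to the native context.
        "factor": target_context / native_context,
        "original_max_position_embeddings": native_context,
    }


config = yarn_rope_scaling(131_072)
```

Here `config["factor"]` is 4.0. A smaller target (for example 65,536 tokens) would give a factor of 2.0, so the factor can be sized to the longest input actually expected rather than always set to the maximum.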

Good for

  • Applications requiring dynamic reasoning capabilities, such as complex problem-solving and code generation.
  • Building highly aligned and engaging conversational agents and chatbots.
  • Multilingual applications, including translation and instruction following across diverse languages.
  • Agent-based systems that require precise integration with external tools.
  • Scenarios demanding long context understanding and generation, up to 131,072 tokens.