vohuutridung/qwen3-1.7b

Hugging Face
TEXT GENERATIONConcurrency Cost:1Model Size:2BQuant:BF16Ctx Length:32kPublished:May 5, 2026License:apache-2.0Architecture:Transformer Open Weights Warm

The Qwen3-1.7B model, developed by Qwen, is a 1.7 billion parameter causal language model with a 32,768 token context length. It uniquely supports seamless switching between a 'thinking mode' for complex reasoning, math, and coding, and a 'non-thinking mode' for general dialogue, ensuring optimal performance across diverse tasks. This model demonstrates enhanced reasoning capabilities, superior human preference alignment for creative writing and multi-turn dialogues, and strong agent capabilities with multilingual support for over 100 languages.

Loading preview...

Qwen3-1.7B Model Overview

Qwen3-1.7B is a 1.7 billion parameter causal language model from the Qwen series, featuring a 32,768 token context length. It is distinguished by its innovative support for dynamically switching between a 'thinking mode' and a 'non-thinking mode'. The thinking mode is optimized for complex logical reasoning, mathematics, and code generation, while the non-thinking mode is designed for efficient, general-purpose dialogue. This dual-mode functionality allows the model to adapt its processing for various scenarios, enhancing both performance and efficiency.

Key Capabilities

  • Dynamic Thinking Modes: Seamlessly switches between a reasoning-focused mode and a general dialogue mode, configurable via enable_thinking parameter or soft switches (/think, /no_think) in user prompts.
  • Enhanced Reasoning: Demonstrates significant improvements in mathematical problem-solving, code generation, and commonsense logical reasoning compared to previous Qwen models.
  • Human Preference Alignment: Excels in creative writing, role-playing, and multi-turn conversations, providing a more natural and engaging user experience.
  • Agentic Functionality: Offers strong tool-calling capabilities, integrating effectively with external tools for complex agent-based tasks, with recommended use of Qwen-Agent.
  • Multilingual Support: Supports over 100 languages and dialects, with robust multilingual instruction following and translation abilities.

Best Practices for Usage

Optimal performance is achieved by adjusting sampling parameters based on the active mode: Temperature=0.6, TopP=0.95, TopK=20 for thinking mode, and Temperature=0.7, TopP=0.8, TopK=20 for non-thinking mode. It is crucial to avoid greedy decoding in thinking mode to prevent performance degradation and repetitions. The model also benefits from adequate output length (up to 38,912 tokens for complex problems) and standardized output formats for benchmarking, such as specific prompts for math problems or multiple-choice questions.