ofloxacin/qwen3-1b

Parameters: 2B · Precision: BF16 · Context length: 40960
Updated: Oct 15, 2025 · License: apache-2.0

Qwen3-1.7B Overview

Qwen3-1.7B is a 1.7 billion parameter causal language model from the Qwen3 series, developed by Qwen. It is distinguished by its innovative ability to switch between a 'thinking mode' for complex logical reasoning, mathematics, and code generation, and a 'non-thinking mode' for general-purpose dialogue, ensuring optimal performance across diverse scenarios.

Key Capabilities

  • Dual-Mode Operation: Seamlessly transitions between a reasoning-focused 'thinking mode' and an efficient 'non-thinking mode'.
  • Enhanced Reasoning: Shows significant improvements in mathematical problem-solving, code generation, and commonsense logical reasoning compared to previous Qwen models.
  • Human Preference Alignment: Excels in creative writing, role-playing, multi-turn conversations, and instruction following, providing a more natural and engaging user experience.
  • Agentic Functionality: Offers strong tool-calling capabilities, integrating precisely with external tools in both thinking and non-thinking modes, achieving leading performance in complex agent-based tasks.
  • Multilingual Support: Supports over 100 languages and dialects, with robust multilingual instruction following and translation abilities.
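When thinking mode is active, the model emits its reasoning before the final answer. As a minimal sketch — assuming the reasoning is wrapped in `<think>...</think>` tags, as Qwen3's chat template does — the two segments can be separated like this (the helper name `split_thinking` is illustrative, not part of any Qwen API):

```python
import re

def split_thinking(output: str) -> tuple[str, str]:
    """Separate the reasoning segment (inside <think>...</think>)
    from the final answer in a raw completion string."""
    match = re.search(r"<think>(.*?)</think>", output, flags=re.DOTALL)
    if match:
        reasoning = match.group(1).strip()
        # Everything outside the tags is treated as the user-facing answer.
        answer = (output[:match.start()] + output[match.end():]).strip()
        return reasoning, answer
    # Non-thinking mode: no tags, the whole output is the answer.
    return "", output.strip()

raw = "<think>2 + 2 is 4.</think>\nThe answer is 4."
reasoning, answer = split_thinking(raw)
```

In non-thinking mode the function simply passes the text through, so the same post-processing path can serve both modes.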

Best Practices for Usage

To optimize performance, specific sampling parameters are recommended for each mode:

  • Thinking Mode: Use Temperature=0.6, TopP=0.95, TopK=20, and MinP=0. Avoid greedy decoding.
  • Non-Thinking Mode: Use Temperature=0.7, TopP=0.8, TopK=20, and MinP=0.
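The two parameter sets above can be kept as ready-made generation configs. A minimal sketch, using keyword names that follow the Hugging Face `transformers` generation conventions (`temperature`, `top_p`, `top_k`, `min_p`); the helper itself is illustrative:

```python
# Recommended sampling parameters from the model card, keyed by mode.
THINKING_PARAMS = dict(temperature=0.6, top_p=0.95, top_k=20, min_p=0.0)
NON_THINKING_PARAMS = dict(temperature=0.7, top_p=0.8, top_k=20, min_p=0.0)

def sampling_params(thinking: bool) -> dict:
    """Return a copy of the recommended sampling parameters for the mode.

    do_sample=True is included because greedy decoding is discouraged
    (especially in thinking mode).
    """
    base = THINKING_PARAMS if thinking else NON_THINKING_PARAMS
    return dict(base, do_sample=True)
```

A dict like this can be passed as `model.generate(**inputs, **sampling_params(thinking=True))`.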

Additionally, the model supports dynamic mode switching within user prompts using /think and /no_think tags for advanced control in multi-turn conversations. For agentic use, integration with Qwen-Agent is recommended to leverage its tool-calling abilities effectively.
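One way to track the soft switches in a multi-turn chat is to scan user turns for the most recent tag. This is a sketch under the assumption that the latest `/think` or `/no_think` tag wins (the helper `effective_mode` is hypothetical, not part of Qwen's tooling):

```python
import re

def effective_mode(messages: list[dict], default: bool = True) -> bool:
    """Return True if thinking mode applies to the next turn.

    Scans user messages in order; the most recent /think or /no_think
    tag overrides earlier ones (assumed behavior). `default` is the
    mode used when no tag has appeared.
    """
    mode = default
    for msg in messages:
        if msg.get("role") != "user":
            continue
        # Alternation lists /no_think first so it is not mis-read
        # as /think plus a stray prefix.
        for tag in re.findall(r"/no_think|/think", msg.get("content", "")):
            mode = (tag == "/think")
    return mode

chat = [
    {"role": "user", "content": "Prove this step by step /think"},
    {"role": "assistant", "content": "<think>...</think> Done."},
    {"role": "user", "content": "Now just a quick answer /no_think"},
]
```

The resolved flag could then drive mode-dependent settings, such as which sampling parameters to use.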