LucasJYH/Qwen3-1.7B

Text Generation · Concurrency Cost: 1 · Model Size: 2B · Quant: BF16 · Ctx Length: 32k · Published: Apr 16, 2026 · License: apache-2.0 · Architecture: Transformer · Open Weights

LucasJYH/Qwen3-1.7B is a 1.7 billion parameter causal language model from the Qwen3 series, developed by Qwen. This model uniquely supports seamless switching between a 'thinking mode' for complex logical reasoning, math, and coding, and a 'non-thinking mode' for efficient general-purpose dialogue. It demonstrates significantly enhanced reasoning capabilities, superior human preference alignment for creative writing and role-playing, and strong agent capabilities with tool integration. The model also supports over 100 languages and dialects for multilingual instruction following and translation.


Qwen3-1.7B: A Versatile Language Model with Dynamic Reasoning

LucasJYH/Qwen3-1.7B is a 1.7 billion parameter causal language model from the Qwen3 series, developed by Qwen. It features a unique ability to switch between a 'thinking mode' for complex tasks and a 'non-thinking mode' for general dialogue, ensuring optimal performance across diverse scenarios. This model is built upon extensive training, offering significant advancements in several key areas.

Key Capabilities

  • Dynamic Reasoning Modes: Seamlessly switches between a 'thinking mode' for logical reasoning, mathematics, and coding, and a 'non-thinking mode' for efficient, general-purpose conversations, allowing performance to be tailored to task complexity (see the sketch after this list).
  • Enhanced Reasoning: Demonstrates improved performance in mathematics, code generation, and commonsense logical reasoning compared to previous Qwen models.
  • Human Preference Alignment: Excels in creative writing, role-playing, multi-turn dialogues, and instruction following, providing a more natural and engaging conversational experience.
  • Agentic Capabilities: Offers strong integration with external tools in both thinking and non-thinking modes, achieving leading performance in complex agent-based tasks among open-source models.
  • Multilingual Support: Supports over 100 languages and dialects, with robust capabilities for multilingual instruction following and translation.
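
For illustration, here is a minimal sketch of toggling the two modes with the Hugging Face transformers library. It assumes this checkpoint ships the upstream Qwen3 chat template with its enable_thinking switch; the prompt and token budget are placeholders.

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "LucasJYH/Qwen3-1.7B"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(
    model_name, torch_dtype="auto", device_map="auto"
)

messages = [
    {"role": "user", "content": "Explain why the sum of two odd numbers is even."}
]

# enable_thinking=True asks the model to emit a <think>...</think> block
# before its final answer; set it to False for plain, low-latency dialogue.
text = tokenizer.apply_chat_template(
    messages,
    tokenize=False,
    add_generation_prompt=True,
    enable_thinking=True,
)

inputs = tokenizer([text], return_tensors="pt").to(model.device)
output_ids = model.generate(**inputs, max_new_tokens=1024)

# Strip the prompt tokens and decode only the newly generated continuation.
print(tokenizer.decode(output_ids[0][len(inputs.input_ids[0]):], skip_special_tokens=True))
```

Switching enable_thinking to False reuses the same code path, so an application can choose per request whether to pay for the extra reasoning tokens.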

Best Practices for Optimal Performance

To maximize performance, specific sampling parameters are recommended for each mode:

  • Thinking Mode: Use Temperature=0.6, TopP=0.95, TopK=20, and MinP=0. Avoid greedy decoding; a usage sketch follows this list.
  • Non-Thinking Mode: Use Temperature=0.7, TopP=0.8, TopK=20, and MinP=0.
  • Output Length: An output length of 32,768 tokens is sufficient for most queries; allow up to 38,912 tokens for highly complex problems.
  • Standardized Output: Use specific prompts for math problems (e.g., "Please reason step by step, and put your final answer within \boxed{}") and a JSON answer structure for multiple-choice questions to standardize model outputs during benchmarking.
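
The snippet below is a usage sketch of these recommendations, not a definitive recipe: it applies the thinking-mode sampling settings from the list above and the suggested \boxed{} math prompt via transformers. The min_p argument assumes a reasonably recent transformers release, and the example prompt and max_new_tokens value are illustrative.

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "LucasJYH/Qwen3-1.7B"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(
    model_name, torch_dtype="auto", device_map="auto"
)

# Math prompt following the standardized-output convention above.
prompt = (
    "Solve 2x + 6 = 14. "
    "Please reason step by step, and put your final answer within \\boxed{}."
)
messages = [{"role": "user", "content": prompt}]
text = tokenizer.apply_chat_template(
    messages, tokenize=False, add_generation_prompt=True, enable_thinking=True
)
inputs = tokenizer([text], return_tensors="pt").to(model.device)

# Thinking-mode sampling settings from the list above;
# do_sample=True avoids greedy decoding as recommended.
output_ids = model.generate(
    **inputs,
    do_sample=True,
    temperature=0.6,
    top_p=0.95,
    top_k=20,
    min_p=0.0,
    max_new_tokens=32768,
)
print(tokenizer.decode(output_ids[0][len(inputs.input_ids[0]):], skip_special_tokens=True))
```

For non-thinking mode, the same call would use Temperature=0.7, TopP=0.8, TopK=20, and MinP=0, with enable_thinking=False in apply_chat_template.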