cs-552-2026-OAAA/safety_model

  • Task: Text Generation
  • Concurrency Cost: 1
  • Model Size: 2B
  • Quant: BF16
  • Ctx Length: 32k
  • Tool Calling: Supported
  • Published: May 17, 2026
  • License: apache-2.0
  • Architecture: Transformer
  • Open Weights
  • Cold

Qwen3-1.7B is a 1.7 billion parameter causal language model developed by Qwen, featuring a unique capability to seamlessly switch between a 'thinking mode' for complex reasoning (math, code) and a 'non-thinking mode' for general dialogue. It offers enhanced reasoning, superior human preference alignment for creative tasks, and advanced agent capabilities with multilingual support for over 100 languages. This model is optimized for flexible performance across diverse scenarios, from logical problem-solving to engaging conversations.

Qwen3-1.7B Overview

Qwen3-1.7B is a 1.7 billion parameter causal language model from the Qwen series, designed for versatile performance across various AI tasks. A key innovation is its ability to dynamically switch between a 'thinking mode' for complex logical reasoning, mathematics, and code generation, and a 'non-thinking mode' for efficient, general-purpose dialogue. This dual-mode functionality allows for optimized performance depending on the task's demands.

Key Capabilities

  • Adaptive Reasoning: Seamlessly transitions between modes to enhance performance in complex problem-solving and general conversation.
  • Enhanced Reasoning: Demonstrates significant improvements in mathematics, code generation, and commonsense logical reasoning compared to previous Qwen models.
  • Superior Alignment: Excels in creative writing, role-playing, multi-turn dialogues, and instruction following, providing a more natural and engaging user experience.
  • Advanced Agentic Functions: Offers robust tool-calling capabilities, achieving leading performance among open-source models for complex agent-based tasks, especially when integrated with frameworks like Qwen-Agent.
  • Multilingual Support: Supports over 100 languages and dialects with strong multilingual instruction following and translation abilities.

Usage and Best Practices

Developers can enable or disable thinking mode with the enable_thinking parameter of tokenizer.apply_chat_template, or per turn with the soft switches /think and /no_think inside user prompts. Each mode has its own recommended sampling parameters, and greedy decoding is discouraged because it can lead to issues like endless repetitions. The model supports a context length of 32,768 tokens, with a recommended maximum output length of up to 38,912 tokens for highly complex problems so that long reasoning traces are not truncated.
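
The mode switches described above can be sketched in a few lines of Python. This is a minimal illustration, not the model authors' reference code: the helper names (with_soft_switch, render_prompt, sampling_for) are hypothetical, the sampling values are those listed in the upstream Qwen3 model card and are assumed to carry over to this checkpoint, and the tokenizer argument is assumed to be a Hugging Face tokenizer whose chat template accepts enable_thinking.

```python
from typing import Any, Dict, List

Message = Dict[str, str]

# Sampling presets taken from the upstream Qwen3 model card.
# Assumption: they also suit this checkpoint; verify before relying on them.
SAMPLING_PRESETS: Dict[str, Dict[str, Any]] = {
    "thinking": {"temperature": 0.6, "top_p": 0.95, "top_k": 20, "min_p": 0.0},
    "non_thinking": {"temperature": 0.7, "top_p": 0.8, "top_k": 20, "min_p": 0.0},
}


def with_soft_switch(messages: List[Message], thinking: bool) -> List[Message]:
    """Return a copy of `messages` with /think or /no_think appended
    to the most recent user turn (hypothetical helper)."""
    switch = "/think" if thinking else "/no_think"
    copied = [dict(m) for m in messages]  # leave the caller's list untouched
    for message in reversed(copied):
        if message["role"] == "user":
            message["content"] = f'{message["content"]} {switch}'
            break
    return copied


def render_prompt(tokenizer: Any, messages: List[Message], enable_thinking: bool) -> str:
    """Render the chat prompt as text, toggling thinking mode via the
    enable_thinking keyword of tokenizer.apply_chat_template."""
    return tokenizer.apply_chat_template(
        messages,
        tokenize=False,
        add_generation_prompt=True,
        enable_thinking=enable_thinking,
    )


def sampling_for(enable_thinking: bool) -> Dict[str, Any]:
    """Pick the sampling preset that matches the selected mode."""
    key = "thinking" if enable_thinking else "non_thinking"
    return dict(SAMPLING_PRESETS[key])
```

With the transformers library, the tokenizer would typically come from AutoTokenizer.from_pretrained, and the dictionary returned by sampling_for could be passed to model.generate together with do_sample=True. The soft switch is useful in multi-turn chats where the mode should change per turn without re-rendering the template with a different enable_thinking value.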