cs-552-2026-baseline/math_model

TEXT GENERATIONConcurrency Cost:1Model Size:2BQuant:BF16Ctx Length:32kTool Calling:SupportedPublished:May 1, 2026License:apache-2.0Architecture:Transformer Open Weights Cold

The cs-552-2026-baseline/math_model, developed by Qwen, is a 1.7 billion parameter causal language model with a 32,768 token context length. It uniquely supports seamless switching between a 'thinking mode' for complex logical reasoning, mathematics, and coding, and a 'non-thinking mode' for efficient general-purpose dialogue. This model is optimized for enhanced reasoning capabilities, excelling in mathematical tasks, code generation, and commonsense logical reasoning.

Loading preview...

Qwen3-1.7B: A Dual-Mode Reasoning Model

Qwen3-1.7B is a 1.7 billion parameter causal language model from the Qwen series, designed with a unique dual-mode operation. It features a 'thinking mode' for intricate logical reasoning, mathematics, and coding, and a 'non-thinking mode' for general, efficient dialogue. This allows the model to adapt its processing for optimal performance across diverse tasks.

Key Capabilities

  • Enhanced Reasoning: Significantly improves performance in mathematics, code generation, and commonsense logical reasoning compared to previous Qwen models.
  • Flexible Operation: Seamlessly switches between thinking and non-thinking modes, which can be controlled via enable_thinking parameter or dynamic /think and /no_think tags in user prompts.
  • Agentic Expertise: Excels in tool-calling capabilities, integrating precisely with external tools in both modes, and achieving leading performance in complex agent-based tasks among open-source models.
  • Multilingual Support: Supports over 100 languages and dialects with strong capabilities for instruction following and translation.
  • Human Preference Alignment: Delivers natural, engaging, and immersive conversational experiences, excelling in creative writing, role-playing, and multi-turn dialogues.

Good For

  • Complex Problem Solving: Ideal for applications requiring deep logical reasoning, such as advanced mathematical problems or intricate coding challenges.
  • Dynamic Conversational Agents: Suitable for building agents that need to switch between analytical processing and general conversation based on user input.
  • Multilingual Applications: Effective for tasks involving instruction following or translation across a wide range of languages.
  • Creative and Role-Playing Scenarios: Excels in generating engaging and contextually appropriate responses for creative writing and interactive role-play.