JetBrains-Research/Qwen3-32B-am

Text Generation · Concurrency Cost: 2 · Model Size: 32B · Quant: FP8 · Ctx Length: 32k · Published: May 14, 2025 · Architecture: Transformer

JetBrains-Research/Qwen3-32B-am is a 32.8 billion parameter causal language model developed by JetBrains Research, based on the Qwen3 architecture. This model uniquely integrates an assistant mask token for improved identification of assistant-generated outputs, while preserving the original Qwen3's advanced reasoning, instruction-following, and agent capabilities. It supports seamless switching between a 'thinking mode' for complex tasks like math and coding, and a 'non-thinking mode' for efficient general dialogue, with a native context length of 32,768 tokens.


JetBrains-Research/Qwen3-32B-am: Enhanced Qwen3 with Assistant Mask

This model is a specialized version of the Qwen3-32B architecture, developed by JetBrains Research. Its primary differentiator is an added assistant mask token, which marks assistant-generated content within the model's outputs. This modification improves output parsing and control without altering Qwen3's core capabilities, making it a drop-in replacement for the original Qwen3-32B.
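In the Hugging Face `transformers` API, `tokenizer.apply_chat_template(..., return_dict=True, return_assistant_tokens_mask=True)` can return such a per-token mask when the chat template marks assistant turns; whether this model's template exposes it exactly that way is an assumption here. A minimal, self-contained sketch of how the mask is consumed once you have it (the token IDs and mask values below are illustrative, not real vocabulary IDs):

```python
def extract_assistant_tokens(token_ids, assistant_mask):
    """Keep only the tokens flagged as assistant-generated.

    assistant_mask[i] == 1 marks token_ids[i] as belonging to an
    assistant turn; 0 marks system/user context tokens.
    """
    return [tok for tok, m in zip(token_ids, assistant_mask) if m == 1]


# Illustrative values: a short dialogue where the last three tokens
# are the assistant's reply.
token_ids      = [101, 202, 303, 404, 505, 606]
assistant_mask = [  0,   0,   0,   1,   1,   1]

extract_assistant_tokens(token_ids, assistant_mask)  # → [404, 505, 606]
```

This is useful, for example, when computing a fine-tuning loss only over assistant tokens, or when logging exactly which spans of a transcript the model produced.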

Key Capabilities & Features

  • Dual-Mode Operation: Seamlessly switches between a 'thinking mode' for complex logical reasoning, mathematics, and code generation, and a 'non-thinking mode' for efficient, general-purpose dialogue.
  • Enhanced Reasoning: Demonstrates significant improvements in mathematical problem-solving, code generation, and commonsense logical reasoning compared to previous Qwen models.
  • Superior Human Preference Alignment: Excels in creative writing, role-playing, multi-turn conversations, and instruction following, providing a more natural and engaging user experience.
  • Advanced Agent Capabilities: Integrates with external tools in both thinking and non-thinking modes, achieving leading performance among open-source models on complex agent-based tasks.
  • Multilingual Support: Supports over 100 languages and dialects with strong capabilities for multilingual instruction following and translation.
  • Extended Context Window: Natively supports a context length of 32,768 tokens, extendable up to 131,072 tokens using the YaRN method.
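Context extension beyond the native 32,768 tokens is typically enabled through a `rope_scaling` entry in the model's `config.json`. A sketch of the YaRN configuration as documented for Qwen3 (a `factor` of 4.0 corresponds to roughly 131,072 tokens):

```json
{
  "rope_scaling": {
    "rope_type": "yarn",
    "factor": 4.0,
    "original_max_position_embeddings": 32768
  }
}
```

Note that the upstream Qwen3 documentation recommends enabling static YaRN only when long contexts are actually needed, since it can slightly degrade performance on short inputs.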

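In thinking mode, Qwen3-style models wrap their reasoning in a `<think>...</think>` block ahead of the final answer. A minimal, self-contained sketch of separating the two parts of a completion (the example text is illustrative):

```python
import re


def split_thinking(text: str) -> tuple[str, str]:
    """Separate the <think>...</think> reasoning block (thinking mode)
    from the final answer in a Qwen3-style completion.

    Returns (reasoning, answer); reasoning is "" in non-thinking mode.
    """
    m = re.search(r"<think>(.*?)</think>", text, flags=re.DOTALL)
    if not m:
        return "", text.strip()  # non-thinking mode: no reasoning block
    reasoning = m.group(1).strip()
    answer = text[m.end():].strip()
    return reasoning, answer


# Example completion produced in thinking mode
completion = "<think>12 * 17 = 204.</think>\nThe answer is 204."
reasoning, answer = split_thinking(completion)
# reasoning → "12 * 17 = 204."
# answer    → "The answer is 204."
```

Separating the block this way lets an application display or log the chain of reasoning independently of the user-facing answer.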
Good For

  • Developers requiring explicit identification of assistant-generated tokens.
  • Applications demanding robust reasoning and problem-solving in areas like math and coding.
  • Creating highly engaging and natural conversational AI experiences.
  • Building sophisticated AI agents that interact with external tools.
  • Multilingual applications requiring strong instruction following and translation.