Qwen/Qwen3-Coder-480B-A35B-Instruct

Warm
Public
480B
FP8
1000000
Jul 22, 2025
License: apache-2.0
Hugging Face
Overview

Qwen3-Coder-480B-A35B-Instruct Overview

Qwen3-Coder-480B-A35B-Instruct is Qwen's most advanced agentic code model, featuring 480 billion total parameters with 35 billion activated. It demonstrates significant performance in agentic coding, agentic browser-use, and other core coding tasks, achieving results comparable to Claude Sonnet.

Key Capabilities and Features

  • Agentic Coding: Excels in tool calling capabilities, supporting platforms like Qwen Code and CLINE with a specialized function call format.
  • Extended Context Length: Offers native support for 256K tokens, which can be extended up to 1M tokens using Yarn, making it suitable for understanding large code repositories.
  • Optimized for Code: Designed to handle complex coding scenarios and generate code effectively.
  • MoE Architecture: Utilizes a Mixture-of-Experts (MoE) architecture with 160 experts and 8 activated experts, contributing to its efficiency and performance.

Best Practices for Usage

  • Sampling Parameters: Recommended settings include temperature=0.7, top_p=0.8, top_k=20, and repetition_penalty=1.05 for optimal output.
  • Output Length: Suggests using an output length of 65,536 tokens for most queries to ensure adequate response generation.
  • No Thinking Mode: This model operates without generating <think></think> blocks, simplifying its output.

Good for

  • Agentic Code Generation: Developing applications that require autonomous code generation and execution.
  • Large-Scale Code Analysis: Projects needing to process and understand extensive codebases due to its long context capabilities.
  • Tool-Use Scenarios: Integrating with external tools and APIs through its robust function calling mechanism.