Qwen/Qwen2.5-Coder-32B-Instruct

Public · 32.8B parameters · FP8 · 131,072-token context
License: apache-2.0 · Available on Hugging Face

Overview

Qwen2.5-Coder-32B-Instruct: Code-Specific LLM

Qwen2.5-Coder-32B-Instruct is a 32.8 billion parameter instruction-tuned model from the Qwen2.5-Coder series, developed by Qwen. This model represents a significant advancement in code-specific large language models, building on the robust Qwen2.5 foundation and scaling its training data to 5.5 trillion tokens, which includes a substantial amount of source code and text-code grounding data.

Key Capabilities and Features

  • Enhanced Code Performance: Demonstrates significant improvements in code generation, code reasoning, and code fixing, aiming to match the coding abilities of models like GPT-4o.
  • Comprehensive Foundation: Designed to support real-world applications such as Code Agents, while maintaining strong performance in mathematics and general competencies.
  • Extended Context Length: Supports a context window of up to 131,072 tokens; YaRN rope scaling must be enabled (per the model card's deployment instructions) to process inputs beyond the native window.
  • Architecture: Utilizes a transformer architecture with RoPE, SwiGLU, RMSNorm, and Attention QKV bias.
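Like other Qwen instruct models, this model converses in the ChatML format, which the tokenizer's chat template produces automatically. As a rough illustration of the prompt structure, here is a minimal pure-Python sketch; in real use you would load the tokenizer and call `tokenizer.apply_chat_template(messages, add_generation_prompt=True)` instead, and the system prompt below is purely illustrative.

```python
# Minimal sketch of the ChatML-style prompt layout used by Qwen
# instruct models. This hand-rolled version only illustrates the
# structure; in practice, tokenizer.apply_chat_template handles it.

def build_chatml_prompt(messages):
    """Render a list of {'role', 'content'} dicts as a ChatML prompt,
    ending with an open assistant turn for the model to complete."""
    parts = []
    for msg in messages:
        parts.append(f"<|im_start|>{msg['role']}\n{msg['content']}<|im_end|>\n")
    parts.append("<|im_start|>assistant\n")  # generation starts here
    return "".join(parts)

messages = [
    {"role": "system", "content": "You are a helpful coding assistant."},
    {"role": "user", "content": "Write a quicksort function in Python."},
]
prompt = build_chatml_prompt(messages)
print(prompt)
```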

Ideal Use Cases

  • Advanced Code Generation: Producing correct, non-trivial code for developers with demanding generation needs.
  • Code Reasoning and Debugging: Applications involving understanding and fixing code logic.
  • Code Agents: Building intelligent agents that can interact with and manipulate code.
  • Long-Context Code Analysis: Tasks requiring processing and understanding large codebases or extensive documentation.
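For the YaRN deployment mentioned under Extended Context Length, the Qwen2.5 model cards describe adding a `rope_scaling` entry to the model's `config.json`. A sketch of that entry is below; the factor of 4.0 reflects scaling a native 32,768-token window up to 131,072 tokens, but verify the exact values against the official model card for this checkpoint.

```json
{
  "rope_scaling": {
    "type": "yarn",
    "factor": 4.0,
    "original_max_position_embeddings": 32768
  }
}
```

Because this static scaling applies to all inputs regardless of length, Qwen's documentation suggests enabling it only when long inputs are actually required, as it can affect quality on shorter texts.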