unsloth/Qwen2.5-Coder-32B-Instruct

  • Visibility: Public
  • Parameters: 32.8B
  • Quantization: FP8
  • Context length: 131,072 tokens
  • Released: Nov 12, 2024
  • License: apache-2.0
  • Source: Hugging Face
Overview

Qwen2.5-Coder-32B-Instruct: Advanced Code-Specific LLM

This model is part of the Qwen2.5-Coder series, a collection of code-specific Qwen large language models developed by the Qwen team. It represents a significant advancement over its predecessor, CodeQwen1.5, with a focus on enhanced coding capabilities.

Key Capabilities & Improvements

  • Superior Code Performance: Achieves significant improvements in code generation, code reasoning, and code fixing.
  • Extensive Training: Trained on 5.5 trillion tokens, including a vast amount of source code, text-code grounding, and synthetic data.
  • State-of-the-Art Coding: The 32B parameter version is positioned as a leading open-source code LLM, with coding abilities comparable to GPT-4o.
  • Real-World Application Foundation: Designed to support complex applications like Code Agents, while retaining strong performance in mathematics and general language understanding.
  • Large Context Window: Features a full 131,072 token context length, enabling processing of extensive codebases and complex prompts.
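To make the 131,072-token window concrete, here is a minimal sketch of a pre-flight check for whether a set of source files is likely to fit in context. The ~3.5 characters-per-token ratio is an assumption for typical source code, not a property of the Qwen tokenizer; use the model's actual tokenizer for exact counts.

```python
# Rough check of whether a set of source files fits in the 131,072-token
# context window, leaving room for the model's reply.
MAX_CONTEXT_TOKENS = 131_072
CHARS_PER_TOKEN = 3.5  # heuristic estimate; varies by language and content


def estimate_tokens(text: str) -> int:
    """Cheap character-based estimate of token count for a string."""
    return int(len(text) / CHARS_PER_TOKEN) + 1


def fits_in_context(files: dict, reserved_for_output: int = 4096) -> bool:
    """True if the concatenated files likely fit alongside the reply budget."""
    prompt_tokens = sum(estimate_tokens(src) for src in files.values())
    return prompt_tokens + reserved_for_output <= MAX_CONTEXT_TOKENS


files = {"app.py": "print('hello')\n" * 1000}
print(fits_in_context(files))  # a small project fits comfortably
```

For precise budgeting, tokenize with the model's own tokenizer instead of this heuristic.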

Model Architecture & Features

  • Type: Causal Language Model
  • Architecture: Transformer-based with RoPE, SwiGLU, RMSNorm, and Attention QKV bias.
  • Training Stage: Pretraining & Post-training
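For reference, two of the components named above, RMSNorm and the SwiGLU feed-forward gate, can be sketched with generic textbook definitions. This is an illustrative NumPy sketch with made-up shapes, not code extracted from the model's implementation.

```python
import numpy as np


def rms_norm(x: np.ndarray, weight: np.ndarray, eps: float = 1e-6) -> np.ndarray:
    """RMSNorm: rescale by the reciprocal root-mean-square (no mean subtraction)."""
    rms = np.sqrt(np.mean(x ** 2, axis=-1, keepdims=True) + eps)
    return x / rms * weight


def swiglu(x: np.ndarray, w_gate: np.ndarray, w_up: np.ndarray) -> np.ndarray:
    """SwiGLU feed-forward gate: silu(x @ W_gate) * (x @ W_up)."""
    gate = x @ w_gate
    silu = gate / (1.0 + np.exp(-gate))  # SiLU / swish activation
    return silu * (x @ w_up)


rng = np.random.default_rng(0)
x = rng.normal(size=(2, 8))          # (batch, hidden) -- illustrative sizes
y = rms_norm(x, np.ones(8))
z = swiglu(x, rng.normal(size=(8, 16)), rng.normal(size=(8, 16)))
print(y.shape, z.shape)  # (2, 8) (2, 16)
```

RMSNorm is cheaper than LayerNorm because it skips the mean-centering step, and SwiGLU is the gated activation used in most recent decoder-only transformers.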

When to Use This Model

This model is ideal for developers and researchers focused on:

  • Advanced code generation tasks.
  • Complex code reasoning and problem-solving.
  • Automated code fixing and refactoring.
  • Developing intelligent Code Agents.
  • Applications requiring a large context window for code-related tasks.
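For the use cases above, prompts are built in the ChatML-style layout that Qwen instruct models use. The sketch below shows the general shape of that layout for illustration; in practice you should let `tokenizer.apply_chat_template` construct the prompt, since the template shipped with the model checkpoint is authoritative.

```python
# Minimal sketch of a ChatML-style prompt layout, as used by Qwen instruct
# models. Illustrative only: prefer tokenizer.apply_chat_template in real code.
def build_chatml_prompt(messages: list) -> str:
    """Render a list of {'role', 'content'} dicts into a ChatML-style string."""
    parts = [
        f"<|im_start|>{m['role']}\n{m['content']}<|im_end|>\n" for m in messages
    ]
    parts.append("<|im_start|>assistant\n")  # the model generates from here
    return "".join(parts)


messages = [
    {"role": "system", "content": "You are a helpful coding assistant."},
    {"role": "user", "content": "Write a quicksort function in Python."},
]
print(build_chatml_prompt(messages))
```

Sampling then continues from the trailing `<|im_start|>assistant\n` marker until the model emits an end-of-turn token.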