unsloth/Qwen2.5-Coder-14B-Instruct

Parameters: 14.8B · Precision: FP8 · Context length: 131,072 · License: apache-2.0

Qwen2.5-Coder-14B-Instruct Overview

This model is part of the Qwen2.5-Coder series, a generation of code-specific large language models developed by Qwen. It significantly improves upon its predecessor, CodeQwen1.5, in code generation, code reasoning, and code fixing. The series, including this 14.8 billion parameter model, is trained on 5.5 trillion tokens of source code, text-code grounding data, and synthetic data.

Key Capabilities

  • Advanced Code Generation: Excels at producing high-quality code across various programming languages.
  • Enhanced Code Reasoning: Demonstrates improved logical understanding for complex coding problems.
  • Superior Code Fixing: Capable of identifying and correcting errors in existing codebases.
  • Foundation for Code Agents: Provides a robust base for developing sophisticated real-world code agent applications.
  • General Competencies: Maintains strong performance in mathematics and general language understanding, alongside its coding prowess.

Training and Architecture

Qwen2.5-Coder models are causal language models using a transformer architecture with RoPE, SwiGLU, RMSNorm, and Attention QKV bias. This instruction-tuned variant is built on the Qwen2.5 base and supports a context length of 131,072 tokens, making it suitable for large codebases and complex prompts.
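Like other Qwen2-family instruct models, this variant consumes prompts in a ChatML-style chat format, which in practice is produced by the tokenizer's `apply_chat_template` method. As a rough, illustrative sketch of what that template expands to (the `<|im_start|>`/`<|im_end|>` delimiters are an assumption based on the Qwen2 family; the tokenizer remains the authoritative source):

```python
# Illustrative sketch of a ChatML-style prompt (assumed delimiters).
# In real use, call tokenizer.apply_chat_template() instead of
# hand-building the string.

def build_chatml_prompt(messages):
    """Render a list of {role, content} dicts into a ChatML prompt string."""
    parts = []
    for msg in messages:
        parts.append(f"<|im_start|>{msg['role']}\n{msg['content']}<|im_end|>\n")
    # A trailing assistant header cues the model to begin its reply.
    parts.append("<|im_start|>assistant\n")
    return "".join(parts)

messages = [
    {"role": "system", "content": "You are a helpful coding assistant."},
    {"role": "user", "content": "Write a quicksort function in Python."},
]
print(build_chatml_prompt(messages))
```

When loading the model with `transformers`, `tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)` produces the canonical version of this string for you.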

Usage Notes

While this repository hosts the 14.8B instruction-tuned model, the Qwen2.5-Coder family also spans sizes from 0.5B to 32B parameters. For detailed evaluation results and performance metrics, refer to the official Qwen2.5-Coder blog and GitHub repository.