Qwen/Qwen2.5-Coder-14B-Instruct

14.8B
FP8
131072
License: apache-2.0

Qwen2.5-Coder-14B-Instruct Overview

Qwen2.5-Coder-14B-Instruct is an instruction-tuned model from the Qwen2.5-Coder series, developed by Qwen. This 14.7 billion parameter causal language model is built on the Qwen2.5 architecture, featuring RoPE, SwiGLU, RMSNorm, and Attention QKV bias. It represents a significant improvement over its predecessor, CodeQwen1.5, with enhanced capabilities in code-related tasks.

Key Capabilities and Features

  • Advanced Code Generation and Reasoning: Significantly improved performance in generating, reasoning about, and fixing code, trained on 5.5 trillion tokens spanning source code, text-code grounding data, and synthetic data.
  • Long Context Support: Offers a full context length of 131,072 tokens (128K); inputs beyond the native 32,768-token window are handled via the YaRN technique for length extrapolation.
  • Foundation for Code Agents: Designed to provide a robust foundation for real-world applications like Code Agents, while maintaining strong performance in mathematics and general competencies.
  • Instruction-Tuned: This specific model is instruction-tuned, making it ready for direct use in conversational and task-oriented scenarios.
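
Because the model is instruction-tuned, it expects prompts in the ChatML format applied by its tokenizer's chat template. The authoritative template ships with the tokenizer and should be applied via `tokenizer.apply_chat_template`; the sketch below is only an illustrative approximation of the rendered format, not the official API:

```python
# Rough, unofficial sketch of the ChatML-style prompt layout used by
# Qwen2.5 instruct models. In practice, always use
# tokenizer.apply_chat_template(messages, add_generation_prompt=True).

def build_chatml_prompt(messages):
    """Render a list of {role, content} dicts as a ChatML-style prompt."""
    parts = []
    for m in messages:
        parts.append(f"<|im_start|>{m['role']}\n{m['content']}<|im_end|>\n")
    # Trailing assistant header marks where the model should continue.
    parts.append("<|im_start|>assistant\n")
    return "".join(parts)

messages = [
    {"role": "system", "content": "You are a helpful coding assistant."},
    {"role": "user", "content": "Write a Python function that checks if a number is prime."},
]
prompt = build_chatml_prompt(messages)
print(prompt)
```

The trailing `<|im_start|>assistant\n` header is what prompts the model to generate its reply rather than continue the user's turn.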

When to Use This Model

Qwen2.5-Coder-14B-Instruct is ideal for developers and researchers focused on:

  • Code-centric AI applications: Its specialization in code generation, reasoning, and fixing makes it highly effective for programming assistance.
  • Long-context coding tasks: The extensive context window is beneficial for understanding and generating code within large projects or complex problem descriptions.
  • Building Code Agents: Its comprehensive capabilities and strong foundation support the development of automated code assistants and agents.
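
For long-context coding tasks, the Qwen2.5 model cards document enabling YaRN by adding a `rope_scaling` entry to the model's `config.json`. The fragment below follows that documented pattern (a factor of 4.0 extrapolates the native 32,768-token window to roughly 131,072 tokens):

```json
{
  "rope_scaling": {
    "factor": 4.0,
    "original_max_position_embeddings": 32768,
    "type": "yarn"
  }
}
```

Note that this static YaRN scaling is applied to all inputs regardless of length, which can slightly degrade performance on short texts, so the model card recommends enabling it only when long-context processing is actually required.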