unsloth/Qwen2.5-Coder-3B-Instruct

Warm
Public
3.1B
BF16
32768
Nov 12, 2024
License: apache-2.0
Hugging Face
Overview

Overview

Qwen2.5-Coder-3B-Instruct is part of the latest Qwen2.5-Coder series, a family of code-specific large language models developed by Qwen. This 3.1 billion parameter instruction-tuned model is built on the Qwen2.5 architecture and features a substantial 32,768 token context length. It represents an advancement over its predecessor, CodeQwen1.5, with significant improvements in core coding tasks.

Key Capabilities

  • Enhanced Code Performance: Demonstrates substantial improvements in code generation, code reasoning, and code fixing. The base Qwen2.5 models were scaled up with 5.5 trillion training tokens, including source code, text-code grounding, and synthetic data.
  • Foundation for Code Agents: Designed to provide a comprehensive foundation for real-world applications like Code Agents, maintaining strong performance in mathematics and general competencies alongside coding.
  • Optimized for Instruction Following: As an instruction-tuned model, it is ready for conversational use and various instruction-based tasks.

Good for

  • Code Generation and Debugging: Ideal for tasks requiring the creation or correction of code across multiple programming languages.
  • Code Reasoning: Suitable for applications that involve understanding and analyzing code logic.
  • Developing Code Agents: Provides a robust base for building intelligent agents that interact with and manipulate code.
  • Applications requiring long context: Its 32,768 token context window supports complex coding tasks and longer conversational turns.