unsloth/Qwen2.5-Coder-3B-Instruct
Hugging Face
Text Generation · Concurrency Cost: 1 · Model Size: 3.1B · Quant: BF16 · Context Length: 32k · Published: Nov 12, 2024 · License: apache-2.0 · Architecture: Transformer · Open Weights

unsloth/Qwen2.5-Coder-3B-Instruct is a 3.1 billion parameter causal language model from the Qwen2.5-Coder series, developed by the Qwen team. The model is optimized specifically for code generation, code reasoning, and code fixing, building on the foundation of Qwen2.5. It offers a 32,768 token context length and targets developers who need strong coding capabilities in a compact model.


Overview

Qwen2.5-Coder-3B-Instruct is part of the latest Qwen2.5-Coder series, a family of code-specific large language models developed by Qwen. This 3.1 billion parameter instruction-tuned model is built on the Qwen2.5 architecture and features a substantial 32,768 token context length. It represents an advancement over its predecessor, CodeQwen1.5, with significant improvements in core coding tasks.

Key Capabilities

  • Enhanced Code Performance: Demonstrates substantial improvements in code generation, code reasoning, and code fixing. The base Qwen2.5 models were trained on 5.5 trillion tokens, including source code, text-code grounding data, and synthetic data.
  • Foundation for Code Agents: Designed to provide a comprehensive foundation for real-world applications like Code Agents, maintaining strong performance in mathematics and general competencies alongside coding.
  • Optimized for Instruction Following: As an instruction-tuned model, it is ready for conversational use and various instruction-based tasks.
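For conversational use, Qwen2.5 instruct models follow the ChatML conversation format. In practice you would call the tokenizer's `apply_chat_template` method rather than build prompts by hand, but a minimal sketch of the prompt layout helps show what the template produces (the `build_chatml` helper below is illustrative, not part of any library):

```python
def build_chatml(messages: list[dict]) -> str:
    """Render a list of {"role", "content"} messages in ChatML format,
    ending with an open assistant turn for the model to complete."""
    parts = []
    for m in messages:
        parts.append(f"<|im_start|>{m['role']}\n{m['content']}<|im_end|>\n")
    # The generation prompt: the model continues from here.
    parts.append("<|im_start|>assistant\n")
    return "".join(parts)


prompt = build_chatml([
    {"role": "system", "content": "You are a helpful coding assistant."},
    {"role": "user", "content": "Write a Python function that reverses a string."},
])
```

The resulting string can be tokenized and passed directly to the model for completion; the model stops its turn by emitting `<|im_end|>`.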

Good for

  • Code Generation and Debugging: Ideal for tasks requiring the creation or correction of code across multiple programming languages.
  • Code Reasoning: Suitable for applications that involve understanding and analyzing code logic.
  • Developing Code Agents: Provides a robust base for building intelligent agents that interact with and manipulate code.
  • Applications requiring long context: Its 32,768 token context window supports complex coding tasks and longer conversational turns.