unsloth/Qwen2.5-Coder-7B

Hugging Face
Text generation · Concurrency cost: 1 · Model size: 7.6B · Quant: FP8 · Context length: 32K · Published: Sep 23, 2024 · License: apache-2.0 · Architecture: Transformer · Open weights · Warm

unsloth/Qwen2.5-Coder-7B is a 7.61 billion parameter causal language model developed by Qwen, part of the Qwen2.5-Coder series. Pretrained on 5.5 trillion tokens with a heavy emphasis on source code, it delivers significant improvements in code generation, code reasoning, and code fixing. The model provides a strong foundation for code agents and supports a context length of up to 131,072 tokens, making it well suited to complex coding tasks and applications requiring deep contextual understanding.


Qwen2.5-Coder-7B Overview

unsloth/Qwen2.5-Coder-7B is a 7.61 billion parameter pretrained causal language model from the Qwen2.5-Coder series, developed by Qwen. It builds on the Qwen2.5 architecture: a transformer featuring RoPE, SwiGLU activations, RMSNorm, and attention QKV bias. The model is designed specifically for advanced coding tasks, having been trained on 5.5 trillion tokens spanning source code, text-code grounding data, and synthetic data.
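To make the 7.61B figure concrete, the parameter count can be reproduced approximately from the Qwen2.5-7B architecture hyperparameters. The values below (hidden size 3584, MLP intermediate size 18944, vocabulary size 152064, untied embeddings) are assumptions taken from the upstream config, not stated on this page:

```python
# Back-of-the-envelope parameter count for Qwen2.5-Coder-7B.
# Hyperparameters are assumed from the upstream Qwen2.5-7B config.
HIDDEN = 3584                     # hidden size
LAYERS = 28                       # transformer blocks
Q_HEADS, KV_HEADS = 28, 4         # GQA: 28 query heads, 4 key-value heads
HEAD_DIM = HIDDEN // Q_HEADS      # 128
INTERMEDIATE = 18944              # SwiGLU MLP width
VOCAB = 152064                    # embedding rows

# Attention: Q/O projections are full-rank, K/V are shrunk by GQA;
# Q, K, V carry biases (the "Attention QKV bias" mentioned above).
q = HIDDEN * (Q_HEADS * HEAD_DIM) + Q_HEADS * HEAD_DIM
k = HIDDEN * (KV_HEADS * HEAD_DIM) + KV_HEADS * HEAD_DIM
v = k
o = (Q_HEADS * HEAD_DIM) * HIDDEN            # no bias on the output projection
attn = q + k + v + o

# SwiGLU MLP: gate, up, and down projections, no biases.
mlp = 3 * HIDDEN * INTERMEDIATE

per_layer = attn + mlp                        # RMSNorm params are negligible
embeddings = 2 * VOCAB * HIDDEN               # untied input + output embeddings
total = LAYERS * per_layer + embeddings

print(f"non-embedding: {LAYERS * per_layer / 1e9:.2f}B, total: {total / 1e9:.2f}B")
```

The sum lands within rounding distance of the advertised 7.61B, with roughly 6.5B of it in the 28 transformer blocks and about 1.1B in the untied embedding matrices.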

Key Capabilities

  • Enhanced Code Performance: Offers significant improvements in code generation, code reasoning, and code fixing compared to its predecessor, CodeQwen1.5.
  • Long Context Support: Supports a full context length of 131,072 tokens; inputs beyond 32,768 tokens are handled via YaRN for length extrapolation.
  • Foundation for Code Agents: Provides a robust base for real-world applications like Code Agents, while maintaining strong performance in mathematics and general competencies.
  • Architecture: Utilizes a transformer architecture with 28 layers and grouped-query attention (28 query heads, 4 key-value heads), totaling 7.61B parameters.
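Per the upstream Qwen2.5-Coder model card, YaRN extrapolation beyond 32,768 tokens is enabled by adding a `rope_scaling` entry to the model's `config.json`. The fragment below mirrors that documented setting (a static scaling factor of 4.0 over the 32K base window); treat it as the upstream recommendation rather than a setting specific to this deployment:

```json
{
  "rope_scaling": {
    "factor": 4.0,
    "original_max_position_embeddings": 32768,
    "type": "yarn"
  }
}
```

Because this static scaling is applied to all inputs regardless of length, the upstream card advises adding it only when long-context processing is actually required, as it can degrade performance on shorter texts.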

Good For

  • Code Generation and Debugging: Excels in generating and fixing code across various programming languages.
  • Code Reasoning: Ideal for tasks requiring deep understanding and logical deduction within codebases.
  • Developing Code Agents: Serves as a powerful foundation for building intelligent code-centric AI agents.
  • Long-Context Applications: Suitable for processing and generating code or text in scenarios requiring very long input contexts.
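As a base (non-instruct) model, Qwen2.5-Coder-7B is typically driven with plain completion or fill-in-the-middle (FIM) prompts rather than a chat template. A minimal sketch of constructing a FIM prompt, using the special tokens documented for the Qwen2.5-Coder series:

```python
# Build a fill-in-the-middle (FIM) prompt using Qwen2.5-Coder's special tokens.
# The model is expected to generate the missing middle after <|fim_middle|>.
def build_fim_prompt(prefix: str, suffix: str) -> str:
    return f"<|fim_prefix|>{prefix}<|fim_suffix|>{suffix}<|fim_middle|>"

prompt = build_fim_prompt(
    prefix="def quicksort(arr):\n    if len(arr) <= 1:\n        return arr\n",
    suffix="\n    return quicksort(left) + [pivot] + quicksort(right)\n",
)
print(prompt)
```

The resulting string is tokenized and sent to the model as an ordinary completion request; the generated tokens fill the gap between the prefix and suffix.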