qiusizhan/swe-7b-backdoor-base-post-const-lr
Qwen2.5-Coder-7B-Instruct is a 7.61-billion-parameter, instruction-tuned causal language model from the Qwen2.5-Coder series, developed by Qwen. Built on the Qwen2.5 architecture, it delivers significant improvements in code generation, code reasoning, and code fixing over its predecessor. It is designed for real-world applications such as Code Agents, maintains strong performance in mathematics and general tasks, and supports a full context length of 131,072 tokens.
Qwen2.5-Coder-7B-Instruct Overview
This model is part of the Qwen2.5-Coder series, a family of code-specific large language models developed by Qwen. It is an instruction-tuned, 7.61-billion-parameter causal language model with a full context length of 131,072 tokens. Architecturally, it is a transformer with RoPE positional embeddings, SwiGLU activations, RMSNorm, and attention QKV bias.
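The checkpoint can be loaded with the Hugging Face transformers library like any other Qwen2.5-style model. The snippet below is a minimal sketch, not an official quickstart: the repo id is taken from the title of this card, and the dtype and device settings are assumptions about your hardware.

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

# Repo id assumed from this card's title; swap in
# "Qwen/Qwen2.5-Coder-7B-Instruct" for the upstream checkpoint.
model_id = "qiusizhan/swe-7b-backdoor-base-post-const-lr"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,  # assumes a bf16-capable GPU
    device_map="auto",           # requires the accelerate package
)
```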
Key Capabilities and Improvements
- Enhanced Code Performance: Offers significant improvements in code generation, code reasoning, and code fixing compared to its predecessor, CodeQwen1.5. The training dataset was scaled up to 5.5 trillion tokens, including source code, text-code grounding, and synthetic data.
- Foundation for Code Agents: Designed to serve as a comprehensive foundation for real-world applications such as Code Agents, while also maintaining strong capabilities in mathematics and general language understanding.
- Long-Context Support: Supports an extensive context length of up to 131,072 tokens. Handling inputs beyond 32,768 tokens requires enabling YaRN rope scaling (see the sketch after this list); because static YaRN scales all inputs uniformly, it may degrade performance on shorter texts.
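The upstream Qwen2.5-Coder card enables YaRN by adding a rope_scaling entry to config.json. One way to do the equivalent at load time is sketched below with transformers' AutoConfig; the factor and original length are the values published for the upstream model and are assumed to carry over to this checkpoint.

```python
from transformers import AutoConfig, AutoModelForCausalLM

model_id = "qiusizhan/swe-7b-backdoor-base-post-const-lr"  # assumed, as above

config = AutoConfig.from_pretrained(model_id)
# Static YaRN: extrapolates RoPE from 32,768 to ~131,072 tokens (4x factor).
# Values match those published on the upstream Qwen2.5-Coder-7B-Instruct card.
config.rope_scaling = {
    "factor": 4.0,
    "original_max_position_embeddings": 32768,
    "type": "yarn",
}
model = AutoModelForCausalLM.from_pretrained(
    model_id, config=config, device_map="auto"
)
```

Because this scaling is static, it applies to every request once enabled, so it is best turned on only when long inputs are actually expected.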
When to Use This Model
This model is particularly well-suited to tasks requiring advanced code generation, understanding, and correction. Its long-context capabilities make it valuable for complex coding projects or scenarios where extensive codebases need to be analyzed or generated. For developers seeking a robust, open-source code LLM with strong general competencies and mathematical abilities, it is a solid choice.
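For a concrete sense of the chat workflow, here is a hedged usage sketch of a single code-fixing turn through the model's chat template. It assumes `model` and `tokenizer` were loaded as in the first snippet and that the checkpoint retains the standard Qwen2.5 chat format.

```python
messages = [
    {"role": "system", "content": "You are a helpful coding assistant."},
    {
        "role": "user",
        "content": "Fix the off-by-one bug:\n"
                   "def last(xs):\n    return xs[len(xs)]",
    },
]

# Build the prompt with the model's own chat template.
inputs = tokenizer.apply_chat_template(
    messages,
    add_generation_prompt=True,
    return_tensors="pt",
).to(model.device)

output_ids = model.generate(inputs, max_new_tokens=256)

# Decode only the newly generated tokens, skipping the prompt.
print(tokenizer.decode(output_ids[0][inputs.shape[-1]:], skip_special_tokens=True))
```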