Qwen/Qwen2.5-Coder-1.5B is a 1.54-billion-parameter causal language model from the Qwen2.5-Coder series, developed by Qwen. Built on the Qwen2.5 architecture, it is designed for code generation, code reasoning, and code fixing. It has a 32,768-token context length and is optimized for real-world coding applications while retaining strong mathematical and general competencies.
Qwen2.5-Coder-1.5B Overview
Qwen2.5-Coder-1.5B is part of the Qwen2.5-Coder series, a family of code-specific large language models developed by Qwen. This 1.54-billion-parameter model is a pre-trained causal language model built on a transformer architecture with RoPE, SwiGLU, RMSNorm, attention QKV bias, and tied word embeddings. Its context length is 32,768 tokens.
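A context length bounds the prompt and the generated tokens together, so callers usually need to budget between the two. The following is a minimal sketch, assuming the 32,768-token figure above is that combined limit. The function name `max_new_tokens` and the `reserve` parameter are hypothetical helpers for illustration, not part of any Qwen API.

```python
CONTEXT_LENGTH = 32_768  # context length stated in the overview above


def max_new_tokens(prompt_tokens: int, reserve: int = 0) -> int:
    """Return how many tokens can still be generated for a prompt of this size.

    `reserve` is a hypothetical safety margin (for example, for special tokens).
    """
    if prompt_tokens < 0 or reserve < 0:
        raise ValueError("token counts must be non-negative")
    remaining = CONTEXT_LENGTH - prompt_tokens - reserve
    # A prompt that already fills the window leaves no room to generate.
    return max(remaining, 0)
```

For instance, a 30,000-token prompt leaves room for 2,768 generated tokens, and a prompt longer than the window leaves none.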
Key Capabilities & Improvements
- Enhanced Code Performance: Significant improvements in code generation, code reasoning, and code fixing compared to its predecessor, CodeQwen1.5.
- Extensive Training: Trained on 5.5 trillion tokens, including a large proportion of source code, text-code grounding, and synthetic data.
- Foundation for Code Agents: Provides a robust foundation for real-world applications like Code Agents, balancing strong coding abilities with general competencies and mathematics.
- Architectural Features: Uses 28 transformer layers with grouped-query attention (GQA), in which 12 query heads share 2 key/value heads.
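The grouped-query attention layout in the last bullet can be made concrete with a little arithmetic. This is a sketch under two assumptions that the text above does not state: query heads are assigned to key/value heads in contiguous groups, and every head has the same dimension. The helper names are hypothetical.

```python
NUM_LAYERS = 28    # from the list above
NUM_Q_HEADS = 12   # query heads
NUM_KV_HEADS = 2   # key/value heads


def kv_head_for(q_head: int) -> int:
    """Map a query head index to the KV head it reads from.

    Assumes contiguous grouping: heads 0-5 share KV head 0, heads 6-11 share KV head 1.
    """
    if not 0 <= q_head < NUM_Q_HEADS:
        raise IndexError(f"query head {q_head} out of range")
    group_size = NUM_Q_HEADS // NUM_KV_HEADS
    return q_head // group_size


def kv_cache_reduction() -> float:
    """KV-cache size of full multi-head attention divided by that of GQA."""
    return NUM_Q_HEADS / NUM_KV_HEADS


# Query heads grouped by the KV head they share.
groups = {
    kv: [q for q in range(NUM_Q_HEADS) if kv_head_for(q) == kv]
    for kv in range(NUM_KV_HEADS)
}

# Cached key and value head-slots per token across all layers.
kv_slots_gqa = 2 * NUM_LAYERS * NUM_KV_HEADS
kv_slots_mha = 2 * NUM_LAYERS * NUM_Q_HEADS
```

Under these assumptions each key/value head serves six query heads, so the key/value cache is one sixth the size it would be if every query head had its own key/value pair.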
Recommended Use Cases
- Code-centric Tasks: Suited to code generation, debugging, and code understanding.
- Further Fine-tuning: Recommended as a base model for post-training techniques such as Supervised Fine-Tuning (SFT), Reinforcement Learning from Human Feedback (RLHF), or continued pretraining to adapt it for specific conversational or fill-in-the-middle tasks.
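For the fill-in-the-middle case, the prompt is typically assembled from the code before and after the gap. The sketch below assumes the special tokens `<|fim_prefix|>`, `<|fim_suffix|>`, and `<|fim_middle|>` described in the Qwen2.5-Coder documentation. They are not listed in the text above, so verify them against the tokenizer before relying on them. `build_fim_prompt` is a hypothetical helper.

```python
# Special tokens as described in the Qwen2.5-Coder documentation.
# Treated as an assumption here: confirm against the model's tokenizer.
FIM_PREFIX = "<|fim_prefix|>"
FIM_SUFFIX = "<|fim_suffix|>"
FIM_MIDDLE = "<|fim_middle|>"


def build_fim_prompt(prefix: str, suffix: str) -> str:
    """Build a prompt asking the model to generate the code between prefix and suffix."""
    return f"{FIM_PREFIX}{prefix}{FIM_SUFFIX}{suffix}{FIM_MIDDLE}"


# The model would be expected to continue after the final token with the missing middle.
prompt = build_fim_prompt(
    prefix="def add(a, b):\n    return ",
    suffix="\n\nprint(add(1, 2))\n",
)
```

The resulting string is passed to the model as an ordinary completion prompt, and the generated continuation is the text that belongs in the gap.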
For detailed evaluation results and further information, refer to the official Qwen2.5-Coder blog and GitHub repository.