Qwen/Qwen2.5-Coder-1.5B
Text Generation · Concurrency Cost: 1 · Model Size: 1.5B · Quant: BF16 · Context Length: 32K · Published: Sep 18, 2024 · License: apache-2.0 · Architecture: Transformer

Qwen/Qwen2.5-Coder-1.5B is a 1.54 billion parameter causal language model from the Qwen2.5-Coder series, developed by the Qwen team. Building on the Qwen2.5 architecture, it is specifically designed for code generation, code reasoning, and code fixing. It features a 32,768-token context length and is optimized for real-world coding applications while maintaining strong mathematical and general competencies.


Qwen2.5-Coder-1.5B Overview

Qwen2.5-Coder-1.5B is part of the latest Qwen2.5-Coder series, a family of code-specific large language models developed by Qwen. This 1.54 billion parameter model is a pre-trained causal language model built on a transformer architecture, featuring RoPE, SwiGLU, RMSNorm, Attention QKV bias, and tied word embeddings. It boasts a substantial context length of 32,768 tokens.
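As a standard Hugging Face checkpoint, the model loads with the transformers library. The snippet below is a minimal sketch assuming a recent transformers release with Qwen2 support and enough memory for the BF16 weights; the prompt is purely illustrative:

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "Qwen/Qwen2.5-Coder-1.5B"

# Load the tokenizer and the BF16 weights, placing them automatically.
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(
    model_name,
    torch_dtype=torch.bfloat16,
    device_map="auto",
)

# This is a base (non-instruct) model: prompt it with plain code to
# complete rather than a chat-formatted conversation.
prompt = "def quicksort(arr):"
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=128)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```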

Key Capabilities & Improvements

  • Enhanced Code Performance: Significant improvements in code generation, code reasoning, and code fixing compared to its predecessor, CodeQwen1.5.
  • Extensive Training: Trained on 5.5 trillion tokens, including a large proportion of source code, text-code grounding, and synthetic data.
  • Foundation for Code Agents: Provides a robust foundation for real-world applications like Code Agents, balancing strong coding abilities with general competencies and mathematics.
  • Architectural Features: Utilizes a transformer architecture with 28 layers and grouped-query attention (GQA) with 12 query heads and 2 key/value heads; these values can be read directly from the published model config, as the sketch below shows.
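A quick way to confirm these architectural details is to read them from the model's published config via transformers (a minimal sketch; only the config JSON is downloaded, not the weights):

```python
from transformers import AutoConfig

# Fetch just the configuration file from the Hugging Face Hub.
config = AutoConfig.from_pretrained("Qwen/Qwen2.5-Coder-1.5B")

print(config.num_hidden_layers)        # 28 transformer layers
print(config.num_attention_heads)      # 12 query heads
print(config.num_key_value_heads)      # 2 key/value heads (GQA)
print(config.max_position_embeddings)  # 32768-token context
```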

Recommended Use Cases

  • Code-centric Tasks: Ideal for tasks requiring advanced code generation, debugging, and understanding.
  • Further Fine-tuning: Recommended as a base model for post-training techniques such as Supervised Fine-Tuning (SFT), Reinforcement Learning from Human Feedback (RLHF), or continued pretraining to adapt it for specific conversational or fill-in-the-middle (FIM) tasks; a FIM prompting sketch follows this list.

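The tokenizer for Qwen2.5-Coder defines FIM special tokens (<|fim_prefix|>, <|fim_suffix|>, <|fim_middle|>). The sketch below follows the FIM prompt format shown in the Qwen2.5-Coder GitHub repository; treat the exact format as an assumption and verify it against the official README:

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "Qwen/Qwen2.5-Coder-1.5B"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(
    model_name, torch_dtype=torch.bfloat16, device_map="auto"
)

# Fill-in-the-middle: the model generates the code that belongs between
# the prefix and the suffix. Token format per the Qwen2.5-Coder repo.
prompt = (
    "<|fim_prefix|>def is_even(n: int) -> bool:\n    "
    "<|fim_suffix|>\n\nprint(is_even(4))<|fim_middle|>"
)
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=32)

# Decode only the newly generated middle span.
middle = outputs[0][inputs.input_ids.shape[1]:]
print(tokenizer.decode(middle, skip_special_tokens=True))
```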
For detailed evaluation results and further information, refer to the official Qwen2.5-Coder blog and GitHub repository.

Popular Sampler Settings

Featherless surfaces the top three sampler parameter combinations its users apply to this model. The specific values are not reproduced here, but the tracked parameters are: temperature, top_p, top_k, frequency_penalty, presence_penalty, repetition_penalty, and min_p.
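These parameters map directly onto an OpenAI-style completion request. The sketch below assumes Featherless exposes an OpenAI-compatible endpoint (the base URL and the exact set of accepted sampler fields should be verified against the Featherless documentation), and the values shown are illustrative, not the recorded user configurations:

```python
from openai import OpenAI

# Assumption: an OpenAI-compatible API at this base URL; verify both the
# URL and the supported sampler fields in the Featherless docs.
client = OpenAI(
    base_url="https://api.featherless.ai/v1",
    api_key="YOUR_FEATHERLESS_API_KEY",
)

response = client.completions.create(
    model="Qwen/Qwen2.5-Coder-1.5B",
    prompt="# Reverse a singly linked list in Python\n",
    max_tokens=256,
    temperature=0.7,        # illustrative values throughout
    top_p=0.8,
    frequency_penalty=0.0,
    presence_penalty=0.0,
    # top_k, repetition_penalty, and min_p are not standard OpenAI fields;
    # OpenAI-compatible servers typically accept them via extra_body.
    extra_body={"top_k": 20, "repetition_penalty": 1.05, "min_p": 0.0},
)
print(response.choices[0].text)
```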