Qwen2.5-Coder-3B Overview
Qwen2.5-Coder-3B is a 3.09 billion parameter model from the Qwen2.5-Coder family, a series of code-specific large language models developed by the Qwen team. It builds on the strong Qwen2.5 foundation, with significant improvements in coding capabilities from extensive pretraining on 5.5 trillion tokens spanning source code, text-code grounding data, and synthetic data.
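As a base model, it can be used for plain code completion out of the box. The snippet below is a minimal sketch using the Hugging Face transformers library; the prompt and generation settings are illustrative, not recommendations.

```python
# Minimal sketch: plain code completion with the base model.
# Assumes the transformers library (plus accelerate for
# device_map="auto") and access to the Qwen/Qwen2.5-Coder-3B
# checkpoint on the Hugging Face Hub.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "Qwen/Qwen2.5-Coder-3B"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(
    model_name, torch_dtype="auto", device_map="auto"
)

# Base models complete text; they are not chat-tuned, so we
# prompt with code rather than a conversational message.
prompt = "def quicksort(arr):\n"
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=128)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```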
Key Capabilities
- Enhanced Code Performance: Demonstrates significant advancements in code generation, code reasoning, and code fixing.
- Comprehensive Foundation: Designed to support real-world applications like Code Agents, while also maintaining strong performance in mathematics and general language understanding.
- Technical Specifications: A 3.09 billion parameter transformer with a 32,768-token context length, using RoPE positional embeddings, the SwiGLU activation, and RMSNorm (see the configuration sketch after this list).
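These architectural details can be read directly from the published model configuration. The sketch below assumes the transformers library; the field names follow the Qwen2 configuration schema, and the exact values come from the checkpoint itself.

```python
# Sketch: inspect architecture details from the model config.
from transformers import AutoConfig

config = AutoConfig.from_pretrained("Qwen/Qwen2.5-Coder-3B")
print(config.max_position_embeddings)  # context length, expected 32768
print(config.hidden_act)               # "silu", the gate activation in SwiGLU
print(config.rms_norm_eps)             # epsilon used by the RMSNorm layers
print(config.rope_theta)               # base frequency for RoPE
```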
Intended Use
This base model is intended primarily as a starting point for post-training, such as supervised fine-tuning (SFT), reinforcement learning from human feedback (RLHF), or continued pretraining, to adapt it to specific conversational or fill-in-the-middle tasks. It is not recommended for direct conversational use without such fine-tuning; a minimal fine-tuning sketch follows below.
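For teams planning post-training, a parameter-efficient SFT run is a common first step. The sketch below is illustrative only: it assumes the transformers, peft, and datasets libraries, and the LoRA settings, toy dataset, and hyperparameters are placeholders rather than recommended values.

```python
# Minimal sketch of parameter-efficient SFT on the base model.
# All hyperparameters and the toy dataset are illustrative.
from datasets import Dataset
from peft import LoraConfig, get_peft_model
from transformers import (
    AutoModelForCausalLM,
    AutoTokenizer,
    DataCollatorForLanguageModeling,
    Trainer,
    TrainingArguments,
)

model_name = "Qwen/Qwen2.5-Coder-3B"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(
    model_name, torch_dtype="auto", device_map="auto"
)

# Attach small LoRA adapters instead of updating all 3.09B weights.
lora = LoraConfig(
    r=16, lora_alpha=32, target_modules=["q_proj", "v_proj"],
    task_type="CAUSAL_LM",
)
model = get_peft_model(model, lora)

# Toy instruction-style example; real SFT would use a curated corpus.
examples = [
    {"text": "### Task: reverse a string\ndef reverse(s):\n    return s[::-1]\n"}
]
dataset = Dataset.from_list(examples).map(
    lambda ex: tokenizer(ex["text"], truncation=True, max_length=512),
    remove_columns=["text"],
)

trainer = Trainer(
    model=model,
    args=TrainingArguments(
        output_dir="qwen-coder-sft",
        per_device_train_batch_size=1,
        num_train_epochs=1,
    ),
    train_dataset=dataset,
    data_collator=DataCollatorForLanguageModeling(tokenizer, mlm=False),
)
trainer.train()
```

LoRA is one reasonable choice here because it keeps memory requirements modest for a 3B model; full-parameter SFT or continued pretraining follow the same Trainer pattern without the peft adapter step.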