Qwen/Qwen2.5-7B

Status: Warm
Visibility: Public
Parameters: 7.6B
Quantization: FP8
Context length: 32768
Released: Sep 15, 2024
License: apache-2.0
Source: Hugging Face

Qwen/Qwen2.5-7B is a 7.61-billion-parameter causal language model developed by Qwen, built on a transformer architecture with RoPE, SwiGLU, RMSNorm, and attention QKV bias. Compared with its predecessor, Qwen2, this base model offers significantly improved knowledge, coding, and mathematics capabilities. It supports a context length of 131,072 tokens and, as a pretrained base model, is intended as a foundation for further fine-tuning for specific applications rather than for direct deployment.

Overview

Qwen2.5-7B is the 7.61-billion-parameter base causal language model of the Qwen2.5 series, developed by Qwen. It builds on the Qwen2 architecture: a transformer with RoPE, SwiGLU, RMSNorm, and attention QKV bias. As a pretrained base model, it is intended as a starting point for post-training such as SFT or RLHF, not for direct conversational use.
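
Because it is a base model, Qwen2.5-7B is prompted for plain-text completion rather than chat. Below is a minimal loading-and-generation sketch using the Hugging Face transformers library; the prompt and generation settings are illustrative, not recommendations from the model card:

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "Qwen/Qwen2.5-7B"
tokenizer = AutoTokenizer.from_pretrained(model_name)
# device_map="auto" requires the accelerate package; torch_dtype="auto"
# picks the dtype stored in the checkpoint config.
model = AutoModelForCausalLM.from_pretrained(
    model_name, torch_dtype="auto", device_map="auto"
)

# Base-model usage: a plain text prefix to be continued, not a chat template.
prompt = "The key ideas behind the transformer architecture are"
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=128)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```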

Key Capabilities & Improvements

  • Enhanced Knowledge & Reasoning: Significantly improved general knowledge, coding, and mathematics capabilities, benefiting from specialized expert models in these domains.
  • Instruction Following: Demonstrates substantial improvements in adhering to instructions and generating structured outputs, including JSON.
  • Long Text Generation: Better performance in generating extended texts, supporting outputs over 8K tokens.
  • Context Length: Supports a context window of up to 131,072 tokens (see the configuration check after this list).
  • Multilingual Support: Offers support for over 29 languages, including major global languages like Chinese, English, French, Spanish, and Japanese.
  • System Prompt Resilience: More robust to diverse system prompts, aiding in role-play and condition-setting for chatbots.
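
Note that the 32768 in the listing metadata likely reflects this particular deployment's serving window, while 131,072 tokens is the maximum context the model itself is configured for. A quick way to confirm the model-side limit from its published config, as a minimal sketch with the transformers library:

```python
from transformers import AutoConfig

# Read the model's configured maximum context directly from its config.json.
cfg = AutoConfig.from_pretrained("Qwen/Qwen2.5-7B")
print(cfg.max_position_embeddings)  # expect 131072 per the model card
```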

Intended Use

This model is a base language model and is primarily intended for developers to perform post-training steps like supervised fine-tuning (SFT), reinforcement learning from human feedback (RLHF), or continued pretraining. It is not recommended for direct conversational use without further fine-tuning.
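
As one illustration of that workflow, here is a minimal SFT sketch using the Hugging Face Trainer. The data file sft_data.jsonl, its "text" column, and the hyperparameters are assumptions for the example, not settings from the model card; full fine-tuning of a 7.6B-parameter model at this configuration also assumes substantial GPU memory.

```python
from datasets import load_dataset
from transformers import (
    AutoModelForCausalLM,
    AutoTokenizer,
    DataCollatorForLanguageModeling,
    Trainer,
    TrainingArguments,
)

model_name = "Qwen/Qwen2.5-7B"
tokenizer = AutoTokenizer.from_pretrained(model_name)
if tokenizer.pad_token is None:
    # Fall back to the EOS token for padding if none is set.
    tokenizer.pad_token = tokenizer.eos_token
model = AutoModelForCausalLM.from_pretrained(model_name, torch_dtype="auto")

# Hypothetical instruction corpus: a JSON-lines file with a "text" column.
dataset = load_dataset("json", data_files="sft_data.jsonl", split="train")

def tokenize(batch):
    return tokenizer(batch["text"], truncation=True, max_length=2048)

tokenized = dataset.map(tokenize, batched=True, remove_columns=dataset.column_names)

# Causal-LM collator (mlm=False) derives labels from input_ids automatically.
collator = DataCollatorForLanguageModeling(tokenizer=tokenizer, mlm=False)

args = TrainingArguments(
    output_dir="qwen2.5-7b-sft",
    per_device_train_batch_size=1,
    gradient_accumulation_steps=16,
    num_train_epochs=1,
    learning_rate=1e-5,
    bf16=True,
    logging_steps=10,
)

Trainer(
    model=model,
    args=args,
    train_dataset=tokenized,
    data_collator=collator,
).train()
```

For parameter-efficient alternatives, adapter methods such as LoRA (e.g., via the peft library) can substantially reduce the memory needed for this step.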