Qwen2.5-7B-Instruct: Enhanced Multilingual LLM
RedHatAI/Qwen2.5-7B-Instruct is an instruction-tuned causal language model from the Qwen2.5 series, developed by the Qwen team. With 7.61 billion parameters, it builds on the Qwen2 architecture and improves on it in several key areas. It supports a context length of 131,072 tokens and can generate up to 8,192 tokens.
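As a concrete illustration of the limits above: because generated tokens count against the same context window, reserving the full 8,192-token generation budget leaves at most 131,072 − 8,192 = 122,880 tokens for the prompt. A minimal sketch (the constants come from this card; exact budgeting can vary by serving stack):

```python
MAX_CONTEXT = 131_072    # total context window in tokens
MAX_GENERATION = 8_192   # maximum tokens the model can generate

def max_prompt_tokens(reserved_for_output: int = MAX_GENERATION) -> int:
    """Tokens available for the prompt once output space is reserved."""
    return MAX_CONTEXT - reserved_for_output

print(max_prompt_tokens())      # 122880
print(max_prompt_tokens(1024))  # 130048 when only 1K output tokens are reserved
```

In practice, serving frameworks reject requests whose prompt plus `max_tokens` exceeds the window, so budgeting the prompt this way up front avoids runtime errors.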
Key Capabilities & Improvements
- Enhanced Knowledge & Reasoning: Substantially more knowledge, with greatly improved performance in coding and mathematics thanks to specialized expert models in those domains.
- Instruction Following: Substantially better at following instructions and at generating long texts (over 8K tokens).
- Structured Data Handling: Better understanding of structured data like tables and improved generation of structured outputs, particularly JSON.
- Robustness: More resilient to diverse system prompts, enhancing role-play and chatbot condition-setting.
- Multilingual Support: Offers comprehensive support for over 29 languages, including Chinese, English, French, Spanish, German, and Japanese.
- Long-Context Processing: Uses YaRN rope scaling to handle texts up to 128K tokens efficiently; YaRN must be enabled in the model configuration when serving long inputs with frameworks such as vLLM.
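To enable YaRN for long inputs, the Qwen2.5 documentation describes adding a `rope_scaling` entry to the model's `config.json`. A sketch of that snippet is below (verify the exact values against the official Qwen2.5 docs for your deployment):

```json
{
  "rope_scaling": {
    "factor": 4.0,
    "original_max_position_embeddings": 32768,
    "type": "yarn"
  }
}
```

Note that static rope scaling applies uniformly, so enabling it can slightly affect quality on short inputs; the Qwen documentation recommends enabling it only when long-context processing is actually needed.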
Architecture & Training
This model is built on a transformer architecture featuring RoPE, SwiGLU, RMSNorm, and attention QKV bias. Training comprised a pretraining stage followed by post-training. For detailed evaluation results and performance benchmarks, refer to the official Qwen2.5 blog and documentation.
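Two of the components named above are easy to show in isolation. The sketch below is an illustrative NumPy implementation of RMSNorm (normalize by root-mean-square rather than mean and variance) and a SwiGLU feed-forward block (a SiLU-gated projection followed by a down-projection); it is a simplified teaching example, not the model's actual code, and the weight shapes are hypothetical:

```python
import numpy as np

def rms_norm(x, gain, eps=1e-6):
    # RMSNorm: scale each vector by its root-mean-square, then apply a learned gain
    rms = np.sqrt(np.mean(x * x, axis=-1, keepdims=True) + eps)
    return x / rms * gain

def silu(x):
    # SiLU (swish) activation: x * sigmoid(x)
    return x / (1.0 + np.exp(-x))

def swiglu_ffn(x, w_gate, w_up, w_down):
    # SwiGLU feed-forward: gate the up-projection with SiLU, then project back down
    return (silu(x @ w_gate) * (x @ w_up)) @ w_down

# Hypothetical toy dimensions for illustration only
rng = np.random.default_rng(0)
x = rng.normal(size=(2, 8))                 # (batch, hidden)
g = np.ones(8)                              # RMSNorm gain
w_gate = rng.normal(size=(8, 16))           # hidden -> intermediate
w_up = rng.normal(size=(8, 16))
w_down = rng.normal(size=(16, 8))           # intermediate -> hidden

normed = rms_norm(x, g)
out = swiglu_ffn(normed, w_gate, w_up, w_down)
```

With a unit gain, the RMS of each normalized vector is approximately 1, which is the property that makes RMSNorm a cheaper drop-in for LayerNorm in large transformer stacks.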