RedHatAI/Qwen2.5-7B-Instruct

Text Generation · Concurrency Cost: 1 · Model Size: 7.6B · Quant: FP8 · Ctx Length: 32k · Published: May 9, 2025 · License: apache-2.0 · Architecture: Transformer · Open Weights · Cold

RedHatAI/Qwen2.5-7B-Instruct is a 7.61-billion-parameter instruction-tuned causal language model developed by Qwen, based on the Qwen2.5 series. It features a transformer architecture with RoPE, SwiGLU, and RMSNorm, and supports a context length of up to 131,072 tokens (32,768 natively, extended via YaRN). The model significantly improves on coding, mathematics, instruction following, long-text generation, and structured output (especially JSON), with multilingual support for over 29 languages.


Qwen2.5-7B-Instruct: Enhanced Multilingual LLM

RedHatAI/Qwen2.5-7B-Instruct is an instruction-tuned causal language model from the Qwen2.5 series, developed by Qwen. This 7.61-billion-parameter model builds on the Qwen2 architecture and improves on it in several key areas, detailed below. It supports a context length of up to 131,072 tokens and can generate up to 8,192 tokens.
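
Below is a minimal inference sketch using Hugging Face `transformers`, following the standard Qwen2.5 chat-template quickstart pattern. The repo id is taken from this card's title and the prompt text is illustrative; it assumes the checkpoint loads through the usual `transformers` path (the FP8 variant may additionally require the `compressed-tensors` package).

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "RedHatAI/Qwen2.5-7B-Instruct"

# Load weights and tokenizer; device_map="auto" places layers on available GPUs.
model = AutoModelForCausalLM.from_pretrained(
    model_name,
    torch_dtype="auto",
    device_map="auto",
)
tokenizer = AutoTokenizer.from_pretrained(model_name)

# Illustrative prompt; the system message follows the Qwen2.5 convention.
messages = [
    {"role": "system", "content": "You are Qwen, created by Alibaba Cloud. You are a helpful assistant."},
    {"role": "user", "content": "Give me a short introduction to large language models."},
]
text = tokenizer.apply_chat_template(
    messages,
    tokenize=False,
    add_generation_prompt=True,
)
model_inputs = tokenizer([text], return_tensors="pt").to(model.device)

# Generate, then strip the prompt tokens from the output before decoding.
generated_ids = model.generate(**model_inputs, max_new_tokens=512)
generated_ids = [
    output[len(prompt):] for prompt, output in zip(model_inputs.input_ids, generated_ids)
]
print(tokenizer.batch_decode(generated_ids, skip_special_tokens=True)[0])
```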

Key Capabilities & Improvements

  • Enhanced Knowledge & Reasoning: Significantly improved performance in coding and mathematics due to specialized expert models.
  • Instruction Following: Demonstrates substantial advancements in adhering to instructions and generating long texts (over 8K tokens).
  • Structured Data Handling: Better understanding of structured data like tables and improved generation of structured outputs, particularly JSON.
  • Robustness: More resilient to diverse system prompts, improving role-play implementation and condition-setting for chatbots.
  • Multilingual Support: Offers comprehensive support for over 29 languages, including Chinese, English, French, Spanish, German, and Japanese.
  • Long-Context Processing: Uses YaRN to handle texts up to 128K tokens efficiently; the Qwen2.5 docs give specific deployment instructions for vLLM (see the configuration sketch after this list).

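The Qwen2.5 documentation enables YaRN by adding a `rope_scaling` entry to the model's `config.json`; a factor of 4.0 extends the native 32,768-token window toward 131,072 tokens. A minimal sketch of that edit in Python follows (the local path is hypothetical; the `rope_scaling` values are those documented in the Qwen2.5 model card).

```python
import json

# Hypothetical path to a local copy of the model repository.
config_path = "Qwen2.5-7B-Instruct/config.json"

with open(config_path) as f:
    config = json.load(f)

# YaRN rope scaling as documented in the Qwen2.5 model card:
# a factor of 4.0 over the native 32,768-token window yields ~131,072 tokens.
config["rope_scaling"] = {
    "factor": 4.0,
    "original_max_position_embeddings": 32768,
    "type": "yarn",
}

with open(config_path, "w") as f:
    json.dump(config, f, indent=2)
```

Note that vLLM supports only static YaRN, meaning the scaling factor is applied regardless of input length, which can hurt quality on short texts; the Qwen2.5 docs therefore advise adding this configuration only when long-context processing is actually required.
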
Architecture & Training

This model is built on a transformer architecture featuring RoPE, SwiGLU, RMSNorm, and Attention QKV bias. It underwent both pretraining and post-training stages. For detailed evaluation results and performance benchmarks, refer to the official Qwen2.5 blog and documentation.
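
These architectural details are visible in the checkpoint's configuration. A small inspection sketch using `transformers` is shown below; the field names assume the Qwen2 configuration class, and the commented values are what that class typically reports rather than guaranteed outputs.

```python
from transformers import AutoConfig

config = AutoConfig.from_pretrained("RedHatAI/Qwen2.5-7B-Instruct")

# Field names assume the Qwen2 configuration class in transformers.
print(config.model_type)               # expected: "qwen2"
print(config.max_position_embeddings)  # maximum position embeddings (context window)
print(config.hidden_act)               # "silu"; gated in the MLP, i.e. SwiGLU
print(config.rms_norm_eps)             # RMSNorm epsilon
print(config.rope_theta)               # RoPE base frequency
```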