LeeAeron/Qwen2.5-7B
LeeAeron/Qwen2.5-7B is a 7.61 billion parameter causal language model from the Qwen2.5 series, developed by the Qwen Team. This base model uses a transformer architecture with RoPE, SwiGLU, and RMSNorm, and supports a context length of 131,072 tokens. Compared with Qwen2, the Qwen2.5 series offers more knowledge and stronger coding and mathematical capabilities, along with better instruction following and structured data understanding. As a pretrained base model, it is intended as a starting point for further fine-tuning toward specialized applications.
Qwen2.5-7B: An Enhanced Base Language Model
Qwen2.5-7B is a 7.61 billion parameter base causal language model in the Qwen2.5 series developed by the Qwen Team. It builds on its predecessor, Qwen2, with improvements in the areas listed below. It uses a transformer architecture with RoPE (rotary position embeddings), SwiGLU activations, and RMSNorm normalization, and supports a context length of 131,072 tokens.
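As a rough sketch of what the 7.61 billion parameter figure means for hardware, the snippet below estimates the memory needed for the weights alone at a few common precisions. The bytes-per-parameter values are standard, but the function name and the decimal-gigabyte convention are choices made here for illustration, and the estimate ignores activations, the KV cache, and framework overhead.

```python
# Parameter count taken from the model description above.
NUM_PARAMS = 7.61e9

# Storage cost per parameter for common precisions (weights only).
BYTES_PER_PARAM = {
    "float32": 4.0,
    "bfloat16": 2.0,
    "float16": 2.0,
    "int8": 1.0,
    "int4": 0.5,
}


def weight_memory_gb(num_params, dtype):
    """Approximate weight storage in decimal gigabytes (1 GB = 1e9 bytes)."""
    return num_params * BYTES_PER_PARAM[dtype] / 1e9


for dtype in BYTES_PER_PARAM:
    print(f"{dtype:>8}: ~{weight_memory_gb(NUM_PARAMS, dtype):.1f} GB")
```

At bfloat16 this works out to roughly 15 GB for the weights, so real memory use during inference will be higher, especially with long contexts.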
Key Capabilities & Improvements
- Enhanced Knowledge & Reasoning: More general knowledge and markedly stronger coding and mathematics, drawing on specialized expert models in those domains.
- Instruction Following: Substantial improvements in following instructions and in generating long texts (over 8K tokens).
- Structured Data Handling: Better understanding of structured data, such as tables, and improved generation of structured outputs, particularly JSON.
- Robustness: More resilient to varied system prompts, which improves role-play and condition-setting for chatbots.
- Multilingual Support: Supports more than 29 languages, including Chinese, English, French, Spanish, Portuguese, German, Italian, Russian, Japanese, Korean, Vietnamese, Thai, and Arabic.
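When relying on the structured-output capability listed above, it usually helps to validate what the model returns, since a base model may keep generating text after the JSON value ends. The helper below is a minimal sketch using only the Python standard library; the function name `extract_first_json` is hypothetical and not part of any Qwen tooling.

```python
import json


def extract_first_json(text):
    """Return the first valid JSON object or array found in `text`, or None."""
    decoder = json.JSONDecoder()
    for start, ch in enumerate(text):
        if ch not in "{[":
            continue
        try:
            # raw_decode parses one JSON value starting at `start`
            # and tolerates trailing text after it.
            value, _end = decoder.raw_decode(text, start)
        except json.JSONDecodeError:
            continue  # not valid JSON from this position; keep scanning
        return value
    return None


sample = 'Result: {"model": "Qwen2.5-7B", "params_b": 7.61}\nNext line of text'
print(extract_first_json(sample))
```

Returning `None` instead of raising lets the caller decide whether to retry generation or fall back to another strategy.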
Intended Use
This repository contains the base Qwen2.5-7B model, a pretrained checkpoint intended as a starting point for post-training rather than as a finished assistant. It is not recommended for direct conversational use without further post-training steps such as Supervised Fine-Tuning (SFT), Reinforcement Learning from Human Feedback (RLHF), or continued pretraining. For detailed evaluation results and performance metrics, refer to the official Qwen2.5 blog.
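Because the base model is not tuned for conversation, one common way to try it is plain text completion with a few-shot prompt. The following is a minimal sketch using the Hugging Face `transformers` library. It assumes this repository hosts standard Transformers-format weights under the name shown on this page, that `transformers`, `torch`, and `accelerate` are installed, and that enough memory is available. The helper names and the `Input:`/`Output:` prompt layout are illustrative choices, not a format prescribed by the Qwen Team.

```python
# Repo name from this page; assumed to hold Transformers-format weights.
MODEL_ID = "LeeAeron/Qwen2.5-7B"


def build_few_shot_prompt(examples, query):
    """Format (input, output) pairs as a plain-text completion prompt."""
    blocks = [f"Input: {src}\nOutput: {dst}" for src, dst in examples]
    # Leave the final output empty so the model continues from here.
    blocks.append(f"Input: {query}\nOutput:")
    return "\n\n".join(blocks)


def complete(prompt, max_new_tokens=32):
    """Continue `prompt` with the base model (downloads weights on first call)."""
    # Imported lazily so the prompt helper works without these packages.
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
    model = AutoModelForCausalLM.from_pretrained(
        MODEL_ID,
        torch_dtype="auto",  # use the dtype stored in the checkpoint
        device_map="auto",   # requires the `accelerate` package
    )
    inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
    output_ids = model.generate(**inputs, max_new_tokens=max_new_tokens)
    # Drop the prompt tokens and decode only the newly generated ones.
    new_tokens = output_ids[0][inputs["input_ids"].shape[1]:]
    return tokenizer.decode(new_tokens, skip_special_tokens=True)
```

A call such as `complete(build_few_shot_prompt([("cat", "chat"), ("dog", "chien")], "house"))` would ask the model to continue the pattern. The quality of the continuation depends heavily on the examples chosen, which is one reason post-training is recommended for production use.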
For more information, visit the Qwen2.5 GitHub repository and documentation.