wvnvwn/qwen-2.5-7B-Instruct-Resta-lr5e-5-scale0.5
wvnvwn/qwen-2.5-7B-Instruct-Resta-lr5e-5-scale0.5 is a 7.6-billion-parameter instruction-tuned language model based on the Qwen2.5 architecture, created by wvnvwn through a linear merge of three Qwen2.5-7B-Instruct variants. The merge combines fine-tuning for general instruction following with fine-tuning for GSM8K mathematical reasoning, aiming for a balance between the two. It is intended for applications that need robust instruction adherence and stronger numerical problem solving within a 32K-token context window.
Overview
This model, wvnvwn/qwen-2.5-7B-Instruct-Resta-lr5e-5-scale0.5, is a 7.6 billion parameter instruction-tuned language model built upon the Qwen2.5 architecture. It was created by wvnvwn using a linear merge method via mergekit.
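The exact merge configuration is not published on this card, so the following is a hypothetical mergekit linear-merge config illustrating what such a setup typically looks like. The variant model paths and the weight values are assumptions (the `scale0.5` suffix in the name hints at a 0.5 coefficient, but this is unconfirmed):

```yaml
# Hypothetical mergekit config -- actual model paths and weights are not published.
merge_method: linear
models:
  - model: Qwen/Qwen2.5-7B-Instruct         # base instruction-tuned model
    parameters:
      weight: 0.5                            # assumed from the "scale0.5" suffix
  - model: path/to/general-ssft-variant      # placeholder: general SSFT variant
    parameters:
      weight: 0.25
  - model: path/to/gsm8k-ssft-variant        # placeholder: GSM8K-focused SSFT variant
    parameters:
      weight: 0.25
dtype: bfloat16
```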
Key Capabilities
- Instruction Following: Inherits and enhances general instruction-following capabilities from its base Qwen2.5-7B-Instruct model.
- Mathematical Reasoning: Incorporates specific fine-tuning from a GSM8K-optimized variant, suggesting improved performance on arithmetic and logical reasoning tasks.
- Merged Architecture: Combines weights from three distinct Qwen2.5-7B-Instruct models, including a base instruction model, a general SSFT (Supervised Fine-Tuning) variant, and a GSM8K-focused SSFT variant.
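A linear merge is just a weighted element-wise average of corresponding parameter tensors across the source models. A minimal sketch in plain Python (the parameter values and the 0.5/0.25/0.25 coefficients below are illustrative toy numbers, not the card's actual merge weights):

```python
def linear_merge(state_dicts, weights):
    """Weighted element-wise average of parameter tensors.

    state_dicts: list of {param_name: list_of_floats}, one per source model
    weights: merge coefficient per model (typically summing to 1.0)
    """
    merged = {}
    for name in state_dicts[0]:
        params = [sd[name] for sd in state_dicts]
        merged[name] = [
            sum(w * p[i] for w, p in zip(weights, params))
            for i in range(len(params[0]))
        ]
    return merged

# Toy example: three "models", each with a single 2-element parameter.
base = {"layer.weight": [1.0, 2.0]}
general_ssft = {"layer.weight": [3.0, 2.0]}
gsm8k_ssft = {"layer.weight": [5.0, 2.0]}

merged = linear_merge([base, general_ssft, gsm8k_ssft], [0.5, 0.25, 0.25])
# merged["layer.weight"] -> [2.5, 2.0]
```

mergekit performs this same averaging over the real checkpoint tensors, layer by layer.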
Good For
- General-purpose instruction following: Suitable for a wide range of conversational and task-oriented applications.
- Mathematical problem-solving: Potentially offers enhanced accuracy for tasks involving numerical reasoning, thanks to the GSM8K-tuned component.
- Developers seeking a balanced Qwen2.5 variant: Provides a blend of general instruction adherence and specialized reasoning, without being overly specialized in one area.
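Qwen2.5-family instruct models use the ChatML prompt format. A minimal sketch of assembling such a prompt by hand, e.g. for a math-style query (in practice the tokenizer's `apply_chat_template` method handles this; the system message and question here are illustrative):

```python
def build_chatml_prompt(messages):
    """Format a message list in the ChatML style used by Qwen2.5 instruct models."""
    parts = [
        f"<|im_start|>{m['role']}\n{m['content']}<|im_end|>"
        for m in messages
    ]
    # Trailing assistant header tells the model to start generating its reply.
    parts.append("<|im_start|>assistant\n")
    return "\n".join(parts)

prompt = build_chatml_prompt([
    {"role": "system", "content": "You are a helpful assistant."},
    {"role": "user", "content": "If a book costs $12 and I buy 3, what do I pay?"},
])
```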