allknowingroger/Qwenslerp2-7B
allknowingroger/Qwenslerp2-7B is a 7.6 billion parameter language model created by allknowingroger using the SLERP merge method. The merge combines fblgit/cybertron-v4-qw7B-MGS and Tsunami-th/Tsunami-0.5x-7B-Instruct, using a V-shaped interpolation curve to weight the two parents differently across layers. With a context length of 32768 tokens, it is designed for general language tasks and achieves an average score of 30.42 on the Open LLM Leaderboard.
Model Overview
allknowingroger/Qwenslerp2-7B is a 7.6 billion parameter language model developed by allknowingroger. It was created using the SLERP merge method from mergekit, combining two distinct base models: fblgit/cybertron-v4-qw7B-MGS and Tsunami-th/Tsunami-0.5x-7B-Instruct.
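SLERP (spherical linear interpolation) blends each pair of corresponding weight tensors along the arc between them on a hypersphere, rather than along a straight line as plain averaging would. A minimal sketch of the idea in Python (illustrative only, not mergekit's actual implementation; the function name and the near-colinear fallback threshold are assumptions):

```python
import numpy as np

def slerp(t, v0, v1, eps=1e-8):
    """Spherically interpolate between flat weight vectors v0 and v1."""
    # Angle between the (normalized) parameter vectors
    v0_n = v0 / np.linalg.norm(v0)
    v1_n = v1 / np.linalg.norm(v1)
    dot = np.clip(np.dot(v0_n, v1_n), -1.0, 1.0)
    if abs(dot) > 1.0 - eps:
        # Nearly colinear vectors: fall back to linear interpolation
        return (1 - t) * v0 + t * v1
    omega = np.arccos(dot)      # arc angle between the two vectors
    so = np.sin(omega)
    # Weighted combination that traces the arc from v0 (t=0) to v1 (t=1)
    return (np.sin((1 - t) * omega) / so) * v0 + (np.sin(t * omega) / so) * v1
```

At t=0 the result is exactly the first model's weights, at t=1 the second's; intermediate values move smoothly along the arc, which tends to preserve the geometry of the weight space better than linear averaging.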
Merge Configuration
The merge used a V-shaped interpolation curve (t: [0, 0.5, 1, 0.5, 0]), which varies the blend ratio across the layer stack: the input and output layers stay close to one parent model's weights, while the middle layers draw equally or more heavily on the other. This layer-dependent weighting suggests an attempt to combine the two parents' strengths for different stages of processing.
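In mergekit, a SLERP merge with this V-shaped curve would be expressed in a YAML configuration along these lines (a sketch, not the card's published config: the layer_range values, base_model choice, and dtype are assumptions):

```yaml
slices:
  - sources:
      - model: fblgit/cybertron-v4-qw7B-MGS
        layer_range: [0, 28]   # assumed layer count for a Qwen2.5-7B-class model
      - model: Tsunami-th/Tsunami-0.5x-7B-Instruct
        layer_range: [0, 28]
merge_method: slerp
base_model: fblgit/cybertron-v4-qw7B-MGS   # assumed; either parent could serve as base
parameters:
  t:
    - value: [0, 0.5, 1, 0.5, 0]   # V-shaped curve, interpolated across layers
dtype: bfloat16
```

mergekit interpolates a list-valued `t` across the layer range, so the five anchor points above produce a gradient from 0 at the first layer, up to 1 at the middle, and back down to 0 at the last.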
Performance Metrics
Evaluated on the Open LLM Leaderboard, Qwenslerp2-7B achieved an average score of 30.42. Key individual benchmark results include:
- IFEval (0-Shot): 52.94
- BBH (3-Shot): 37.44
- MATH Lvl 5 (4-Shot): 31.87
- MMLU-PRO (5-Shot): 39.06
These scores provide insight into its capabilities across instruction following (IFEval), challenging multi-step reasoning (BBH), mathematical problem-solving (MATH), and broad domain knowledge (MMLU-PRO).
Potential Use Cases
Given its merged architecture and benchmark performance, this model is suitable for:
- General text generation and understanding tasks.
- Applications requiring instruction following.
- Exploration of merged model performance for specific tasks.