allknowingroger/HomerSlerp4-7B
allknowingroger/HomerSlerp4-7B is a 7.6 billion parameter language model created by allknowingroger with the SLERP merge method, combining allknowingroger/Qwen2.5-7B-task8 and allknowingroger/HomerSlerp2-7B. The model targets general language tasks, using its merged architecture to achieve balanced performance across benchmarks. It supports a 32768 token context length, making it suitable for longer inputs.
Model Overview
allknowingroger/HomerSlerp4-7B is a 7.6 billion parameter language model developed by allknowingroger. It was created with the SLERP merge method from mergekit, combining two base models: allknowingroger/Qwen2.5-7B-task8 and allknowingroger/HomerSlerp2-7B. The merge configuration used a V-shaped curve for parameter interpolation, weighting HomerSlerp2-7B more heavily in the input and output layers and Qwen2.5-7B-task8 more heavily in the middle layers.
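To make the merge method concrete, here is a minimal NumPy sketch of SLERP between two weight tensors, plus a hypothetical V-shaped per-layer schedule. The actual interpolation values for HomerSlerp4-7B come from its mergekit configuration; the `v_curve` helper below is an illustrative assumption, not the model's exact schedule.

```python
import numpy as np

def slerp(t, v0, v1, eps=1e-8):
    # Spherical linear interpolation (SLERP): t=0 returns v0, t=1 returns
    # v1, and intermediate t follows the great-circle arc between the
    # two tensors' directions rather than a straight line.
    v0 = np.asarray(v0, dtype=np.float64)
    v1 = np.asarray(v1, dtype=np.float64)
    dot = np.dot(v0 / np.linalg.norm(v0), v1 / np.linalg.norm(v1))
    omega = np.arccos(np.clip(dot, -1.0, 1.0))  # angle between directions
    if np.sin(omega) < eps:
        # Nearly collinear tensors: fall back to linear interpolation
        return (1.0 - t) * v0 + t * v1
    s0 = np.sin((1.0 - t) * omega) / np.sin(omega)
    s1 = np.sin(t * omega) / np.sin(omega)
    return s0 * v0 + s1 * v1

def v_curve(layer, n_layers):
    # Hypothetical V-shaped schedule: the interpolation factor is large
    # at the first and last layers (favoring one parent model there) and
    # small in the middle (favoring the other parent).
    x = layer / (n_layers - 1)   # layer position normalized to [0, 1]
    return abs(2.0 * x - 1.0)    # 1 at the ends, 0 at the midpoint
```

In a real mergekit run, a factor like this is applied per layer (often per parameter group, e.g. attention vs. MLP weights) to blend the two parents.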
Performance Highlights
Evaluated on the Open LLM Leaderboard, HomerSlerp4-7B demonstrates balanced performance across tasks. Key metrics include:
- Avg. Score: 28.62
- IFEval (0-shot): 43.74
- BBH (3-shot): 36.79
- MATH Lvl 5 (4-shot): 29.53
- MMLU-PRO (5-shot): 38.58
Detailed evaluation results are available on the Open LLM Leaderboard.
Use Cases
This model is suitable for general-purpose language generation and understanding tasks where a 7.6 billion parameter model with a 32768 token context window is appropriate. Its merged architecture aims to leverage the strengths of its constituent models for diverse applications.
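For reference, a standard Hugging Face `transformers` generation snippet works with this model (assuming `transformers` and `torch` are installed; the first run downloads roughly 15 GB of weights, and the prompt below is only an example):

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "allknowingroger/HomerSlerp4-7B"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype="auto",   # use the checkpoint's native precision
    device_map="auto",    # place layers on available GPU(s)/CPU
)

prompt = "Explain the SLERP merge method in one paragraph."
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=256)
# Decode only the newly generated tokens, not the echoed prompt
print(tokenizer.decode(
    outputs[0][inputs["input_ids"].shape[-1]:],
    skip_special_tokens=True,
))
```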