Model Overview
Gille/StrangeMerges_13-7B-slerp is a 7-billion-parameter language model developed by Gille. It was produced by a spherical linear interpolation (slerp) merge of two base models, Gille/StrangeMerges_12-7B-slerp and uukuguy/speechless-zephyr-code-functionary-7b, a technique intended to combine the strengths of both sources into a single versatile model.
Key Capabilities
- Merged Architecture: Uses a slerp merge that applies different interpolation values (t) to the self-attention and MLP layers, tuning how much each base model contributes to each component.
- General-Purpose Performance: Achieves an average score of 66.06 on the Open LLM Leaderboard, indicating solid performance across a range of tasks.
- Reasoning and Common Sense: Demonstrates capabilities in reasoning (AI2 Reasoning Challenge: 63.82) and common sense understanding (HellaSwag: 84.95, Winogrande: 79.87).
- Knowledge and Problem Solving: Scores 64.90 on MMLU (Massive Multitask Language Understanding) and 54.21 on GSM8k (mathematical word problems), showcasing its ability to handle complex academic and arithmetic tasks.
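The slerp method named above interpolates along the arc between two weight tensors instead of the straight line used by plain averaging. A minimal pure-Python sketch of the idea follows; the `t_for` schedule and its specific t values are illustrative assumptions, not the model's actual merge parameters:

```python
import math

def slerp(t, v0, v1, eps=1e-8):
    """Spherical linear interpolation between two weight vectors (plain lists)."""
    n0 = math.sqrt(sum(x * x for x in v0)) + eps
    n1 = math.sqrt(sum(x * x for x in v1)) + eps
    # Angle between the two (normalized) weight vectors.
    dot = sum((a / n0) * (b / n1) for a, b in zip(v0, v1))
    dot = max(-1.0, min(1.0, dot))
    theta = math.acos(dot)
    if theta < eps:  # nearly parallel: fall back to linear interpolation
        return [(1 - t) * a + t * b for a, b in zip(v0, v1)]
    s = math.sin(theta)
    w0 = math.sin((1 - t) * theta) / s
    w1 = math.sin(t * theta) / s
    return [w0 * a + w1 * b for a, b in zip(v0, v1)]

def t_for(param_name, t_attn=0.3, t_mlp=0.7, t_default=0.5):
    """Hypothetical per-layer schedule: attention and MLP weights get different t."""
    if "self_attn" in param_name:
        return t_attn
    if "mlp" in param_name:
        return t_mlp
    return t_default
```

At t=0 the result is the first model's weights, at t=1 the second's; applying a different t per parameter group is what lets the merge weight each model's attention and MLP layers differently.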
Good for
- Balanced Applications: Ideal for use cases requiring a general-purpose LLM with a balanced performance profile across various benchmarks.
- Research into Merged Models: Provides an example of a slerp-merged model, useful for developers interested in exploring model merging techniques and their impact on performance.
- Instruction Following: Given its base models, it is likely to perform well in instruction-following scenarios, though specific instruction-tuning details are not provided in the merge configuration.
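For readers exploring merging techniques, a slerp merge of this kind is typically expressed as a mergekit configuration. The sketch below follows mergekit's slerp format, with per-filter t values for the self-attention and MLP layers; the layer ranges and t values shown are placeholders, not this model's actual configuration:

```yaml
# Illustrative mergekit-style slerp config (t values are placeholders).
slices:
  - sources:
      - model: Gille/StrangeMerges_12-7B-slerp
        layer_range: [0, 32]
      - model: uukuguy/speechless-zephyr-code-functionary-7b
        layer_range: [0, 32]
merge_method: slerp
base_model: Gille/StrangeMerges_12-7B-slerp
parameters:
  t:
    - filter: self_attn      # interpolation schedule for attention weights
      value: [0, 0.5, 0.3, 0.7, 1]
    - filter: mlp            # different schedule for MLP weights
      value: [1, 0.5, 0.7, 0.3, 0]
    - value: 0.5             # default t for all other parameters
dtype: bfloat16
```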