Gille/StrangeMerges_35-7B-slerp
Gille/StrangeMerges_35-7B-slerp is a 7-billion-parameter language model created by Gille through a slerp (spherical linear interpolation) merge of StrangeMerges_34-7B-slerp and StrangeMerges_32-7B-slerp. The merge uses a layer-wise interpolation schedule to combine the strengths of its constituent models, and the result achieves an average score of 74.75 on the Open LLM Leaderboard. It is designed for general language understanding and generation tasks, with solid performance across reasoning and common-sense benchmarks.
Model Overview
Gille/StrangeMerges_35-7B-slerp is a 7-billion-parameter language model developed by Gille. It is the product of a spherical linear interpolation (slerp) merge of two earlier models: Gille/StrangeMerges_34-7B-slerp and Gille/StrangeMerges_32-7B-slerp. Slerp interpolates between the parents' weights along the surface of a hypersphere rather than along a straight line, and this merge applies different interpolation values to the self-attention and MLP layers, allowing a more nuanced combination of the two models.
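Model merges like this are typically produced with community tooling rather than hand-written code, but the core slerp operation itself is simple. The sketch below illustrates slerp on flattened weight tensors; it is a simplified illustration of the technique, not the exact merge code used to build this model.

```python
import numpy as np

def slerp(t, a, b, eps=1e-8):
    """Spherical linear interpolation between two flattened weight tensors.

    t=0 returns a, t=1 returns b; intermediate t values move along
    the arc between the two vectors rather than the straight line.
    """
    a_unit = a / (np.linalg.norm(a) + eps)
    b_unit = b / (np.linalg.norm(b) + eps)
    # Angle between the two weight vectors on the unit hypersphere.
    dot = np.clip(np.dot(a_unit, b_unit), -1.0, 1.0)
    theta = np.arccos(dot)
    if theta < eps:
        # Nearly parallel vectors: fall back to plain linear interpolation.
        return (1 - t) * a + t * b
    return (np.sin((1 - t) * theta) * a + np.sin(t * theta) * b) / np.sin(theta)
```

In a layer-wise merge, a function like this would be applied per weight tensor, with the interpolation value `t` chosen differently for self-attention and MLP parameters, as described above.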
Key Capabilities & Performance
This model demonstrates strong general-purpose language understanding and generation. Its performance on the Open LLM Leaderboard highlights its capabilities across several benchmarks:
- Average score: 74.75
- AI2 Reasoning Challenge (25-shot): 71.67
- HellaSwag (10-shot): 88.34
- MMLU (5-shot): 64.66
- TruthfulQA (0-shot): 75.76
- Winogrande (5-shot): 83.35
- GSM8k (5-shot): 64.75
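The reported average is the arithmetic mean of the six individual benchmark scores, which can be sanity-checked directly:

```python
# Open LLM Leaderboard scores for Gille/StrangeMerges_35-7B-slerp.
scores = {
    "ARC (25-shot)": 71.67,
    "HellaSwag (10-shot)": 88.34,
    "MMLU (5-shot)": 64.66,
    "TruthfulQA (0-shot)": 75.76,
    "Winogrande (5-shot)": 83.35,
    "GSM8k (5-shot)": 64.75,
}

# Mean of the six benchmarks matches the reported 74.75 average.
average = sum(scores.values()) / len(scores)
```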
Good for
- General text generation and conversational AI.
- Tasks requiring common sense reasoning and factual recall.
- Applications where a balanced performance across various benchmarks is desired from a 7B parameter model.