Gille/StrangeMerges_8-7B-slerp
Gille/StrangeMerges_8-7B-slerp is a 7-billion-parameter language model created by Gille by merging Gille/StrangeMerges_7-7B-slerp and Gille/StrangeMerges_5-7B-ties with the slerp method. It achieves an average score of 73.39 on the Open LLM Leaderboard, with notable results on reasoning and common-sense benchmarks, and suits general language-generation tasks that need a balance of performance and efficiency within a 4096-token context window.
Overview
Gille/StrangeMerges_8-7B-slerp is a 7-billion-parameter language model developed by Gille. It was produced by merging two existing models, Gille/StrangeMerges_7-7B-slerp and Gille/StrangeMerges_5-7B-ties, using the slerp (spherical linear interpolation) merge method via LazyMergekit. Merging lets the resulting model combine the strengths of its constituent models without additional training.
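To see what slerp actually does, here is a minimal NumPy sketch of spherical linear interpolation between two weight vectors. This is an illustration of the math only, not mergekit's implementation: mergekit applies the interpolation per tensor across the two models' weights.

```python
import numpy as np

def slerp(t, v0, v1, eps=1e-8):
    """Spherical linear interpolation between vectors v0 and v1 at fraction t.

    Unlike plain linear interpolation, slerp moves along the arc between the
    two (normalized) directions, which tends to preserve weight magnitudes
    better when blending model parameters.
    """
    v0n = v0 / np.linalg.norm(v0)
    v1n = v1 / np.linalg.norm(v1)
    dot = np.clip(np.dot(v0n, v1n), -1.0, 1.0)
    theta = np.arccos(dot)  # angle between the two directions
    if theta < eps:
        # Nearly parallel vectors: fall back to linear interpolation
        return (1 - t) * v0 + t * v1
    return (np.sin((1 - t) * theta) * v0 + np.sin(t * theta) * v1) / np.sin(theta)

# Halfway between two orthogonal unit vectors stays on the unit circle
a = np.array([1.0, 0.0])
b = np.array([0.0, 1.0])
mid = slerp(0.5, a, b)
```

Note that `mid` keeps unit norm, whereas the plain average `(a + b) / 2` would shrink it; this norm-preserving behavior is why slerp is a popular merge method.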
Key Capabilities & Performance
This model has been evaluated on the Open LLM Leaderboard, achieving an overall average score of 73.39. Specific benchmark results include:
- AI2 Reasoning Challenge (25-shot): 71.08
- HellaSwag (10-shot): 87.75
- MMLU (5-shot): 65.26
- TruthfulQA (0-shot): 64.52
- Winogrande (5-shot): 84.45
- GSM8k (5-shot): 67.25
These scores indicate strong performance across reasoning, common-sense, and language-understanding tasks. The model operates with a context length of 4096 tokens.
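The leaderboard average is simply the mean of the six benchmark scores above, which can be checked directly:

```python
# Per-benchmark scores as reported on the Open LLM Leaderboard
scores = {
    "ARC (25-shot)": 71.08,
    "HellaSwag (10-shot)": 87.75,
    "MMLU (5-shot)": 65.26,
    "TruthfulQA (0-shot)": 64.52,
    "Winogrande (5-shot)": 84.45,
    "GSM8k (5-shot)": 67.25,
}

# Mean of the six scores: 73.385, which rounds to the reported 73.39
average = sum(scores.values()) / len(scores)
```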
Good For
- General text generation and conversational AI.
- Applications requiring robust performance on reasoning and common-sense tasks.
- Developers looking for a merged model that balances different capabilities from its base components.
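For reference, a LazyMergekit slerp merge of these two models is typically driven by a YAML configuration like the sketch below. The actual recipe is not shown on this card, so the `layer_range`, `t` interpolation values, and `dtype` here are illustrative assumptions, not the author's settings.

```yaml
# Hypothetical mergekit config; interpolation values are assumptions
slices:
  - sources:
      - model: Gille/StrangeMerges_7-7B-slerp
        layer_range: [0, 32]
      - model: Gille/StrangeMerges_5-7B-ties
        layer_range: [0, 32]
merge_method: slerp
base_model: Gille/StrangeMerges_7-7B-slerp
parameters:
  t:
    - filter: self_attn
      value: [0, 0.5, 0.3, 0.7, 1]
    - filter: mlp
      value: [1, 0.5, 0.7, 0.3, 0]
    - value: 0.5
dtype: bfloat16
```

The `t` parameter controls how far the interpolation leans toward the second model, and per-filter values let attention and MLP tensors blend at different ratios.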