Gille/StrangeMerges_17-7B-dare_ties
Gille/StrangeMerges_17-7B-dare_ties is a 7-billion-parameter language model created by Gille by merging Gille/StrangeMerges_16-7B-slerp and Gille/StrangeMerges_12-7B-slerp with the dare_ties method. It achieves an average score of 69.54 on the Open LLM Leaderboard, reflecting solid general reasoning and language understanding across benchmarks. With a 4096-token context length, it suits a range of general-purpose text generation and comprehension tasks.
Overview
Gille/StrangeMerges_17-7B-dare_ties is a 7-billion-parameter language model developed by Gille. It was produced by merging two existing models, Gille/StrangeMerges_16-7B-slerp and Gille/StrangeMerges_12-7B-slerp, with the dare_ties merge method, which combines the strengths of the constituent models into a new, potentially more capable model.
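Merges like this are typically produced with the mergekit tool. The YAML below is a hypothetical sketch of what such a dare_ties configuration could look like; the base model, densities, and weights are illustrative assumptions, not the card's published recipe:

```yaml
# Hypothetical mergekit config sketch; all parameter values are illustrative.
merge_method: dare_ties
base_model: mistralai/Mistral-7B-v0.1   # assumed base; not stated on the card
models:
  - model: Gille/StrangeMerges_16-7B-slerp
    parameters:
      density: 0.5   # fraction of delta weights kept by DARE pruning
      weight: 0.5    # contribution to the TIES merge
  - model: Gille/StrangeMerges_12-7B-slerp
    parameters:
      density: 0.5
      weight: 0.5
dtype: bfloat16
```

In dare_ties, each source model's weight deltas are randomly pruned (density) and rescaled before a sign-consensus TIES merge, which tends to reduce interference between the merged models.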
Key Capabilities
- General Language Understanding: Achieves competitive scores on standard benchmarks, indicating proficiency in various language tasks.
- Reasoning: Performs well on reasoning benchmarks such as the AI2 Reasoning Challenge and GSM8k.
- Context Handling: Supports a context length of 4096 tokens, allowing for processing moderately long inputs.
Performance Benchmarks
The model's performance is evaluated on the Open LLM Leaderboard, with the following key results:
- Average Score: 69.54
- AI2 Reasoning Challenge (25-shot): 66.64
- HellaSwag (10-shot): 86.04
- MMLU (5-shot): 65.07
- TruthfulQA (0-shot): 53.18
- Winogrande (5-shot): 81.93
- GSM8k (5-shot): 64.37
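As a quick sanity check, the leaderboard average is the unweighted mean of the six benchmark scores above (the dictionary keys here are just labels for readability, not an official API):

```python
# Unweighted mean of the six Open LLM Leaderboard benchmark scores.
scores = {
    "ARC (25-shot)": 66.64,
    "HellaSwag (10-shot)": 86.04,
    "MMLU (5-shot)": 65.07,
    "TruthfulQA (0-shot)": 53.18,
    "Winogrande (5-shot)": 81.93,
    "GSM8k (5-shot)": 64.37,
}
average = sum(scores.values()) / len(scores)
print(round(average, 2))  # 69.54
```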
Good For
This model suits developers who want a 7B-parameter model with solid general-purpose capabilities, particularly for tasks that require reasoning (e.g., GSM8k-style math word problems) and broad language understanding, as its benchmark scores indicate.