Gille/StrangeMerges_18-7B-dare_ties
Gille/StrangeMerges_18-7B-dare_ties is a 7-billion-parameter language model created by Gille by merging Gille/StrangeMerges_17-7B-dare_ties and teknium/OpenHermes-2.5-Mistral-7B with the dare_ties method. It demonstrates strong general reasoning, achieving an average score of 67.06 on the Open LLM Leaderboard, and is suited to tasks requiring robust language understanding and generation. Its context length is 4096 tokens.
Model Overview
Gille/StrangeMerges_18-7B-dare_ties is a 7-billion-parameter language model developed by Gille. It is the product of a merge operation combining the strengths of two models: Gille/StrangeMerges_17-7B-dare_ties and teknium/OpenHermes-2.5-Mistral-7B. The merge was performed with the dare_ties method, a technique for combining fine-tuned variants that share a common base model while pruning and rescaling parameter deltas to reduce interference between them.
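Merges like this are typically produced with mergekit. The exact configuration used for this model is not published here, so the following is only a hypothetical sketch: the `weight` and `density` values and the choice of `base_model` are assumptions, not the author's actual settings.

```yaml
# Hypothetical mergekit config for a dare_ties merge of the two parents.
# weight/density values and base_model are illustrative assumptions.
models:
  - model: Gille/StrangeMerges_17-7B-dare_ties
    parameters:
      weight: 0.5   # assumed blend weight
      density: 0.5  # assumed fraction of delta parameters kept by DARE
  - model: teknium/OpenHermes-2.5-Mistral-7B
    parameters:
      weight: 0.5
      density: 0.5
merge_method: dare_ties
base_model: mistralai/Mistral-7B-v0.1  # assumed common base of both fine-tunes
dtype: bfloat16
```

With a config like this, `density` controls how aggressively each model's delta from the base is sparsified before the TIES-style sign resolution and rescaling are applied.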
Key Capabilities & Performance
This model exhibits solid performance across a range of general language understanding and reasoning tasks, as evidenced by its evaluation on the Open LLM Leaderboard. It achieved an average score of 67.06, with notable results in specific areas:
- AI2 Reasoning Challenge (25-shot): 64.08
- HellaSwag (10-shot): 84.37
- MMLU (5-shot): 63.65
- TruthfulQA (0-shot): 52.17
- Winogrande (5-shot): 77.27
- GSM8k (5-shot): 60.80
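The headline average is simply the unweighted mean of the six benchmark scores above, which can be checked directly:

```python
# Reproduce the Open LLM Leaderboard average from the six per-benchmark scores.
scores = {
    "ARC (25-shot)": 64.08,
    "HellaSwag (10-shot)": 84.37,
    "MMLU (5-shot)": 63.65,
    "TruthfulQA (0-shot)": 52.17,
    "Winogrande (5-shot)": 77.27,
    "GSM8k (5-shot)": 60.80,
}

average = sum(scores.values()) / len(scores)
print(round(average, 2))  # 67.06
```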
Use Cases
Given its balanced performance across various benchmarks, StrangeMerges_18-7B-dare_ties is well-suited for general-purpose applications requiring robust language generation and comprehension. Its capabilities make it a strong candidate for tasks such as:
- General question answering
- Text summarization
- Content generation
- Reasoning-based tasks
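For these use cases the model can be loaded with the Hugging Face `transformers` library. One of its parents, OpenHermes-2.5-Mistral-7B, uses the ChatML prompt format; the sketch below assumes the merge inherits that template. The `build_chatml_prompt` and `generate_reply` helpers are illustrative names, not part of any published API for this model.

```python
def build_chatml_prompt(messages):
    """Format a list of {"role", "content"} dicts as a ChatML prompt,
    the template used by OpenHermes-2.5 (assumed inherited by the merge)."""
    parts = [f"<|im_start|>{m['role']}\n{m['content']}<|im_end|>" for m in messages]
    parts.append("<|im_start|>assistant")  # cue the model to respond
    return "\n".join(parts)


def generate_reply(messages, model_id="Gille/StrangeMerges_18-7B-dare_ties",
                   max_new_tokens=256):
    """Sketch of a generation call; downloads ~14 GB of weights on first use."""
    # Imported lazily so the formatting helper above stays dependency-free.
    import torch
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(model_id)
    model = AutoModelForCausalLM.from_pretrained(
        model_id, torch_dtype=torch.float16, device_map="auto"
    )
    prompt = build_chatml_prompt(messages)
    inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
    with torch.no_grad():
        out = model.generate(**inputs, max_new_tokens=max_new_tokens)
    # Decode only the newly generated tokens, not the echoed prompt.
    return tokenizer.decode(out[0][inputs["input_ids"].shape[-1]:],
                            skip_special_tokens=True)
```

If the repository ships a chat template in its tokenizer config, `tokenizer.apply_chat_template(messages, add_generation_prompt=True)` can replace the manual ChatML helper.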