Gille/StrangeMerges_12-7B-slerp
Gille/StrangeMerges_12-7B-slerp is a 7-billion-parameter language model created by Gille, built by merging Keynote-Technology/KAI-7B-v0.1 and Gille/StrangeMerges_11-7B-slerp with the slerp (spherical linear interpolation) merge method. The model achieves an average score of 69.13 on the Open LLM Leaderboard, demonstrating strong performance across a range of reasoning and language understanding tasks. With a 4096-token context length, it is suitable for general-purpose text generation and conversational AI applications.
StrangeMerges_12-7B-slerp Overview
StrangeMerges_12-7B-slerp is a 7-billion-parameter language model developed by Gille. It was created with LazyMergekit by merging two base models, Keynote-Technology/KAI-7B-v0.1 and Gille/StrangeMerges_11-7B-slerp, using slerp (spherical linear interpolation). Separate interpolation factors (t) were applied to the self-attention and MLP layers, tuning how much each parent model contributes to different parts of the network.
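To illustrate the core operation, here is a minimal sketch of spherical linear interpolation between two flattened weight tensors. This helper is hypothetical, written for illustration only: the actual merge was performed by LazyMergekit, and the exact per-layer t values used for this model are not reproduced here.

```python
import numpy as np

def slerp(t, v0, v1, eps=1e-8):
    """Spherically interpolate between weight vectors v0 and v1 at factor t."""
    # Normalize to compare directions, guarding against zero-norm tensors.
    v0n = v0 / (np.linalg.norm(v0) + eps)
    v1n = v1 / (np.linalg.norm(v1) + eps)
    dot = np.clip(np.dot(v0n, v1n), -1.0, 1.0)
    theta = np.arccos(dot)  # angle between the two weight directions
    if theta < eps:
        # Nearly parallel vectors: fall back to plain linear interpolation.
        return (1 - t) * v0 + t * v1
    s = np.sin(theta)
    # Weights follow the great-circle arc instead of the straight chord.
    return (np.sin((1 - t) * theta) / s) * v0 + (np.sin(t * theta) / s) * v1
```

Unlike plain averaging, slerp preserves the geometric relationship between the two parents' weights, which is why merge tools expose it alongside linear interpolation.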
Key Capabilities & Performance
This model demonstrates solid performance across a range of benchmarks, as evaluated on the Open LLM Leaderboard. It achieved an average score of 69.13, with notable results including:
- AI2 Reasoning Challenge (25-Shot): 66.64
- HellaSwag (10-Shot): 85.89
- MMLU (5-Shot): 64.94
- TruthfulQA (0-shot): 52.55
- Winogrande (5-shot): 81.69
- GSM8k (5-shot): 63.08
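As a quick sanity check, the reported leaderboard average is simply the unweighted mean of the six benchmark scores above:

```python
# Unweighted mean of the six Open LLM Leaderboard scores listed above.
scores = {
    "ARC (25-shot)": 66.64,
    "HellaSwag (10-shot)": 85.89,
    "MMLU (5-shot)": 64.94,
    "TruthfulQA (0-shot)": 52.55,
    "Winogrande (5-shot)": 81.69,
    "GSM8k (5-shot)": 63.08,
}
average = sum(scores.values()) / len(scores)
print(round(average, 2))  # 69.13
```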
These scores indicate proficiency in common-sense reasoning (HellaSwag, Winogrande), knowledge and language understanding (ARC, MMLU), factual reliability (TruthfulQA), and grade-school mathematical problem-solving (GSM8k).
Use Cases
StrangeMerges_12-7B-slerp is well-suited for general text generation tasks, conversational AI, and applications requiring robust reasoning capabilities within a 7 billion parameter footprint. Its 4096-token context length supports processing moderately long inputs.
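For these use cases, the model can be loaded like any other causal language model on the Hugging Face Hub. The sketch below assumes the standard transformers AutoModelForCausalLM API and that transformers and torch are installed; the `generate` helper and prompt are illustrative, not from the original card.

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

MODEL_ID = "Gille/StrangeMerges_12-7B-slerp"

def generate(prompt: str, max_new_tokens: int = 128) -> str:
    """Download the model weights (several GB) and generate a completion."""
    tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
    model = AutoModelForCausalLM.from_pretrained(MODEL_ID, torch_dtype="auto")
    inputs = tokenizer(prompt, return_tensors="pt")
    output = model.generate(**inputs, max_new_tokens=max_new_tokens)
    return tokenizer.decode(output[0], skip_special_tokens=True)

if __name__ == "__main__":
    print(generate("Explain spherical linear interpolation in one sentence."))
```

Prompts longer than the 4096-token context window will be truncated by the tokenizer, so keep combined prompt and generation length within that budget.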