Gille/StrangeMerges_23-7B-slerp
StrangeMerges_23-7B-slerp is a 7-billion-parameter language model created by Gille by merging paulml/OGNO-7B and Gille/StrangeMerges_21-7B-slerp with the slerp method. It has a 4096-token context length and an average score of 76.17 on the Open LLM Leaderboard across six reasoning and language-understanding benchmarks. It is suitable for general-purpose text generation and conversational AI applications.
StrangeMerges_23-7B-slerp: A Merged 7B Language Model
StrangeMerges_23-7B-slerp is a 7-billion-parameter model developed by Gille by merging two source models: paulml/OGNO-7B and Gille/StrangeMerges_21-7B-slerp. The merge was performed with the slerp (spherical linear interpolation) method via LazyMergekit. Slerp interpolates between the two models' weights along an arc rather than a straight line, with the aim of a balanced combination of the source models' capabilities.
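To make the slerp step concrete, here is a minimal sketch of spherical linear interpolation on two flat weight vectors. It is an illustration only, not the code LazyMergekit runs: the function name `slerp`, the `eps` threshold, and the fallback to plain linear interpolation for near-parallel or zero vectors are assumptions made for this example.

```python
import math


def slerp(t, v0, v1, eps=1e-8):
    """Interpolate between vectors v0 and v1 along an arc.

    t = 0 returns v0, t = 1 returns v1. Hypothetical helper for
    illustration; real merge tools apply this per weight tensor.
    """
    norm0 = math.sqrt(sum(a * a for a in v0))
    norm1 = math.sqrt(sum(b * b for b in v1))

    # Assumption: if either vector is zero, fall back to linear interpolation.
    if norm0 == 0.0 or norm1 == 0.0:
        return [(1 - t) * a + t * b for a, b in zip(v0, v1)]

    # Cosine of the angle between the two vectors, clamped for acos.
    cos_theta = sum(a * b for a, b in zip(v0, v1)) / (norm0 * norm1)
    cos_theta = max(-1.0, min(1.0, cos_theta))
    theta = math.acos(cos_theta)
    sin_theta = math.sin(theta)

    # Assumption: for (anti)parallel vectors the arc is ill-defined,
    # so use linear interpolation instead.
    if sin_theta < eps:
        return [(1 - t) * a + t * b for a, b in zip(v0, v1)]

    w0 = math.sin((1 - t) * theta) / sin_theta
    w1 = math.sin(t * theta) / sin_theta
    return [w0 * a + w1 * b for a, b in zip(v0, v1)]


# Halfway between two orthogonal unit vectors stays on the unit circle.
midpoint = slerp(0.5, [1.0, 0.0], [0.0, 1.0])
print(midpoint)
```

Unlike plain averaging, which would give a midpoint of length about 0.71 for these two unit vectors, the slerp midpoint keeps unit length.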
Key Capabilities & Performance
The model was evaluated on the Open LLM Leaderboard, where it achieves an average score of 76.17 across six benchmarks:
- AI2 Reasoning Challenge (25-shot): 73.55
- HellaSwag (10-shot): 88.90
- MMLU (5-shot): 64.87
- TruthfulQA (0-shot): 75.13
- Winogrande (5-shot): 84.29
- GSM8k (5-shot): 70.28
These scores indicate strong general reasoning, common sense, and language understanding abilities, with MMLU (64.87) and GSM8k (70.28) the lowest of the six. The model is configured with a 4096-token context length, so the prompt and the generated output together must fit within 4096 tokens.
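The reported average can be checked directly from the six benchmark scores listed above. The short sketch below assumes the average is an unweighted mean rounded to two decimals; the dictionary name `scores` is just for this example.

```python
# Benchmark scores as listed on the model card.
scores = {
    "AI2 Reasoning Challenge": 73.55,
    "HellaSwag": 88.90,
    "MMLU": 64.87,
    "TruthfulQA": 75.13,
    "Winogrande": 84.29,
    "GSM8k": 70.28,
}

# Unweighted mean of the six scores, rounded to two decimals.
average = round(sum(scores.values()) / len(scores), 2)
print(average)  # 76.17
```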
Ideal Use Cases
- General Text Generation: Capable of producing coherent and contextually relevant text for various prompts.
- Conversational AI: Its performance on reasoning and truthfulness benchmarks suggests suitability for interactive applications.
- Research and Experimentation: Provides a solid base for further fine-tuning or exploring merged model architectures.