StrangeMerges_23-7B-slerp: A Merged 7B Language Model
StrangeMerges_23-7B-slerp is a 7-billion-parameter model developed by Gille, produced by merging two source models: paulml/OGNO-7B and Gille/StrangeMerges_21-7B-slerp. The merge was performed with LazyMergekit using slerp (spherical linear interpolation), which blends the two models' weights along an arc rather than a straight line.
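To make the slerp idea concrete, here is a minimal sketch of spherical linear interpolation between two small vectors in plain Python. It is an illustration only, not LazyMergekit's code: the function name `slerp`, the `eps` threshold, and the fallback to linear interpolation for near-parallel vectors are assumptions, and the interpolation factor `t` used for this particular merge is not stated here.

```python
import math

def slerp(t, v0, v1, eps=1e-8):
    """Spherically interpolate between vectors v0 and v1 (t=0 gives v0, t=1 gives v1).

    Illustrative sketch only; a real merge tool applies this per weight tensor.
    """
    def lerp():
        # Plain linear interpolation, used as a fallback.
        return [(1 - t) * a + t * b for a, b in zip(v0, v1)]

    norm0 = math.sqrt(sum(a * a for a in v0))
    norm1 = math.sqrt(sum(b * b for b in v1))
    if norm0 == 0.0 or norm1 == 0.0:
        return lerp()  # the angle is undefined for a zero vector

    # Angle between the two vectors, with the cosine clamped for safety.
    cos_theta = sum(a * b for a, b in zip(v0, v1)) / (norm0 * norm1)
    cos_theta = max(-1.0, min(1.0, cos_theta))
    theta = math.acos(cos_theta)
    sin_theta = math.sin(theta)
    if abs(sin_theta) < eps:
        return lerp()  # vectors are nearly parallel

    # Weights that follow the arc between the two directions.
    w0 = math.sin((1 - t) * theta) / sin_theta
    w1 = math.sin(t * theta) / sin_theta
    return [w0 * a + w1 * b for a, b in zip(v0, v1)]

# Halfway between two orthogonal unit vectors stays on the unit circle.
midpoint = slerp(0.5, [1.0, 0.0], [0.0, 1.0])
print(midpoint)
```

Unlike a straight average, which would give a midpoint of length about 0.71 in this case, the slerp midpoint keeps unit length.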
Key Capabilities & Performance
On the Open LLM Leaderboard, the model achieves an average score of 76.17 across six benchmarks:
- AI2 Reasoning Challenge (25-shot): 73.55
- HellaSwag (10-shot): 88.90
- MMLU (5-shot): 64.87
- TruthfulQA (0-shot): 75.13
- Winogrande (5-shot): 84.29
- GSM8k (5-shot): 70.28
These scores indicate strong general reasoning, commonsense, and language understanding abilities; results are highest on HellaSwag and Winogrande and lowest on MMLU. The model is configured with a 4096-token context length, which suits tasks that need a moderate amount of context.
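The reported average of 76.17 is consistent with an unweighted mean of the six benchmark scores listed above. A short check, assuming equal weighting (the variable names are illustrative):

```python
# Benchmark scores as listed in this section.
scores = {
    "AI2 Reasoning Challenge (25-shot)": 73.55,
    "HellaSwag (10-shot)": 88.90,
    "MMLU (5-shot)": 64.87,
    "TruthfulQA (0-shot)": 75.13,
    "Winogrande (5-shot)": 84.29,
    "GSM8k (5-shot)": 70.28,
}

# Unweighted mean across all six benchmarks.
average = sum(scores.values()) / len(scores)
print(f"{average:.2f}")  # prints 76.17
```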
Ideal Use Cases
- General Text Generation: Produces coherent, contextually relevant text for a variety of prompts.
- Conversational AI: Its results on reasoning and truthfulness benchmarks suggest it is suitable for interactive applications.
- Research and Experimentation: Provides a solid base for further fine-tuning or for exploring model-merging techniques.