Gille/StrangeMerges_15-7B-slerp
Gille/StrangeMerges_15-7B-slerp is a 7 billion parameter language model created by Gille, produced by a slerp merge of Gille/StrangeMerges_14-7B-slerp and CultriX/Wernicke-7B-v9. The model targets general text generation tasks, drawing on the strengths of both parent models to deliver balanced performance across benchmarks. It features a 4096-token context length and an average score of 72.41 on the Open LLM Leaderboard, indicating solid reasoning and language understanding capabilities.
Model Overview
Gille/StrangeMerges_15-7B-slerp is a 7 billion parameter language model developed by Gille. It is the product of a slerp (spherical linear interpolation) merge combining two models: Gille/StrangeMerges_14-7B-slerp and CultriX/Wernicke-7B-v9. This merging technique blends the weights of its constituent models along the shortest arc on the unit hypersphere, aiming to preserve the strengths of both and yield a versatile model for a range of natural language processing tasks.
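To make the merging idea concrete, the following is a minimal sketch of slerp applied to two flattened weight tensors. This is a generic illustration of the formula, not the exact implementation used to build this model (tools like mergekit apply it per layer with configurable interpolation factors):

```python
import numpy as np

def slerp(t, v0, v1, eps=1e-8):
    """Spherical linear interpolation between two weight vectors.

    t=0 returns v0, t=1 returns v1; intermediate t values follow
    the arc on the unit hypersphere between the two directions.
    """
    # Angle between the two (normalized) weight vectors
    v0_unit = v0 / np.linalg.norm(v0)
    v1_unit = v1 / np.linalg.norm(v1)
    dot = np.clip(np.dot(v0_unit, v1_unit), -1.0, 1.0)
    theta = np.arccos(dot)

    # Nearly parallel vectors: fall back to plain linear interpolation
    if theta < eps:
        return (1.0 - t) * v0 + t * v1

    sin_theta = np.sin(theta)
    return (np.sin((1.0 - t) * theta) / sin_theta) * v0 + \
           (np.sin(t * theta) / sin_theta) * v1

# Toy example with 2-D "weights"
a = np.array([1.0, 0.0])
b = np.array([0.0, 1.0])
mid = slerp(0.5, a, b)  # halfway along the arc between a and b
```

Unlike straight averaging, slerp keeps the interpolated weights on a smooth path between the two models, which is often gentler on the learned geometry of the parameters.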
Key Capabilities & Performance
This model demonstrates strong performance across a range of benchmarks, as evaluated on the Open LLM Leaderboard. Key scores include:
- Average Score: 72.41
- AI2 Reasoning Challenge (25-shot): 68.00
- HellaSwag (10-shot): 86.82
- MMLU (5-shot): 65.58
- TruthfulQA (0-shot): 59.99
- Winogrande (5-shot): 82.56
- GSM8k (5-shot): 71.49
These results indicate a balanced capability in reasoning, common sense, language understanding, and mathematical problem-solving. The model supports a context length of 4096 tokens.
Usage
Developers can easily integrate StrangeMerges_15-7B-slerp into their projects using the transformers library, with provided Python code examples for text generation. The model is configured to use bfloat16 for efficient computation.
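A typical loading-and-generation sketch with the transformers library is shown below. The prompt text and generation settings are illustrative; the bfloat16 dtype matches the model's configuration as described above:

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "Gille/StrangeMerges_15-7B-slerp"

def generate(prompt, max_new_tokens=128):
    """Load the model in bfloat16 and generate a completion for `prompt`."""
    tokenizer = AutoTokenizer.from_pretrained(model_id)
    model = AutoModelForCausalLM.from_pretrained(
        model_id,
        torch_dtype=torch.bfloat16,  # efficient half-precision inference
        device_map="auto",           # place layers on available devices
    )
    inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
    outputs = model.generate(**inputs, max_new_tokens=max_new_tokens)
    return tokenizer.decode(outputs[0], skip_special_tokens=True)

if __name__ == "__main__":
    print(generate("Explain what a slerp model merge is in one sentence."))
```

Note that loading a 7B-parameter model in bfloat16 requires roughly 14 GB of accelerator memory; for constrained hardware, quantized loading (e.g. via bitsandbytes) is a common alternative.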