Model Overview
Gille/StrangeMerges_11-7B-slerp is a 7-billion-parameter language model developed by Gille. It is the product of merging two models, Gille/StrangeMerges_10-7B-slerp and mlabonne/NeuralBeagle14-7B, using the slerp (spherical linear interpolation) merge method, which blends the weights of the two parent models to combine their strengths.
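Slerp interpolates along the great-circle arc between two weight vectors rather than the straight line between them, which preserves their geometry better than plain linear averaging. A minimal NumPy sketch of the interpolation itself (the function name, epsilon, and linear-interpolation fallback are illustrative; merge tools apply this per weight tensor, not as a single vector):

```python
import numpy as np

def slerp(t, v0, v1, eps=1e-8):
    """Spherical linear interpolation between two weight vectors.

    t=0 returns v0, t=1 returns v1; intermediate t moves along
    the arc between the two directions.
    """
    v0n = v0 / (np.linalg.norm(v0) + eps)
    v1n = v1 / (np.linalg.norm(v1) + eps)
    dot = np.clip(np.dot(v0n, v1n), -1.0, 1.0)
    theta = np.arccos(dot)              # angle between the two vectors
    if theta < eps:                     # nearly parallel: fall back to lerp
        return (1 - t) * v0 + t * v1
    s0 = np.sin((1 - t) * theta) / np.sin(theta)
    s1 = np.sin(t * theta) / np.sin(theta)
    return s0 * v0 + s1 * v1
```

For unit-norm inputs, the result stays on the unit sphere for every `t`, which is the property that distinguishes slerp from a straight weighted average.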
Key Capabilities & Performance
This model exhibits robust performance across a range of benchmarks, as evaluated on the Open LLM Leaderboard. It achieves an average score of 74.80, indicating strong general language understanding and reasoning. Notable benchmark results include:
- AI2 Reasoning Challenge (25-shot): 72.53
- HellaSwag (10-shot): 88.20
- MMLU (5-shot): 65.04
- TruthfulQA (0-shot): 69.81
- Winogrande (5-shot): 82.32
- GSM8k (5-shot): 70.89
These scores highlight its proficiency in common sense reasoning, factual recall, and mathematical problem-solving. The model supports a context length of 4096 tokens.
Usage
Developers can easily integrate StrangeMerges_11-7B-slerp into their projects using the Hugging Face transformers library. The provided Python code snippet demonstrates how to load the model and tokenizer, apply a chat template, and generate text, making it accessible for various applications requiring conversational AI or text generation.
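A typical loading pattern follows, assuming a standard transformers setup (the prompt text and sampling parameters are illustrative, and a GPU with sufficient memory is assumed for float16 inference):

```python
import torch
import transformers
from transformers import AutoTokenizer

model_id = "Gille/StrangeMerges_11-7B-slerp"
messages = [{"role": "user", "content": "What is a large language model?"}]

# Format the conversation with the model's chat template.
tokenizer = AutoTokenizer.from_pretrained(model_id)
prompt = tokenizer.apply_chat_template(
    messages, tokenize=False, add_generation_prompt=True
)

# Build a text-generation pipeline and sample a response.
pipeline = transformers.pipeline(
    "text-generation",
    model=model_id,
    torch_dtype=torch.float16,
    device_map="auto",
)
outputs = pipeline(
    prompt, max_new_tokens=256, do_sample=True,
    temperature=0.7, top_k=50, top_p=0.95,
)
print(outputs[0]["generated_text"])
```

`apply_chat_template` handles the model-specific prompt formatting, so the same code works with other chat models by changing only `model_id`.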
Good For
- General-purpose text generation and understanding tasks.
- Applications requiring strong reasoning and common sense.
- Developers looking for a capable 7B model with balanced performance across multiple benchmarks.