mychen76/mistral-7b-merged-ties
mychen76/mistral-7b-merged-ties is a 7-billion-parameter language model created by mychen76 by merging OpenPipe/mistral-ft-optimized-1218 and mlabonne/NeuralHermes-2.5-Mistral-7B with the TIES method, using mistralai/Mistral-7B-v0.1 as the base model. Built on the Mistral architecture, it targets general language understanding and generation, scores well across standard benchmarks (71.37 average on the Open LLM Leaderboard), and suits applications that need a capable 7B model with a 4096-token context length.
Overview
mychen76/mistral-7b-merged-ties is a 7-billion-parameter language model built on the Mistral architecture. It was created by mychen76 by combining three Mistral-based models: mistralai/Mistral-7B-v0.1, OpenPipe/mistral-ft-optimized-1218, and mlabonne/NeuralHermes-2.5-Mistral-7B. The merge used the TIES method (TrIm, Elect Sign & Merge), with mistralai/Mistral-7B-v0.1 serving as the base model.
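In broad strokes, TIES works on task vectors: each fine-tune is reduced to its parameter-wise difference from the base, those deltas are trimmed to their largest-magnitude entries, a sign is elected per parameter, and only deltas agreeing with the elected sign are averaged back onto the base. The PyTorch sketch below illustrates these three steps on raw state dicts; the function name and the `density` and `weight` parameters are illustrative assumptions, not the actual configuration used to produce this model.

```python
import torch

def ties_merge(base_sd, finetuned_sds, density=0.5, weight=1.0):
    """Minimal TIES sketch: trim each task vector, elect a per-parameter
    sign, then average only the deltas that agree with that sign."""
    merged = {}
    for name, base in base_sd.items():
        base = base.float()
        # Task vectors: parameter-wise difference from the base model.
        deltas = [sd[name].float() - base for sd in finetuned_sds]

        # Trim: zero out all but the top-`density` fraction by magnitude.
        trimmed = []
        for d in deltas:
            k = max(1, int(density * d.numel()))
            # The k-th largest |value| is the (numel - k + 1)-th smallest.
            cutoff = d.abs().flatten().kthvalue(d.numel() - k + 1).values
            trimmed.append(torch.where(d.abs() >= cutoff, d, torch.zeros_like(d)))

        stacked = torch.stack(trimmed)
        # Elect sign: the sign with the larger total magnitude wins.
        elected = torch.sign(stacked.sum(dim=0))

        # Disjoint merge: average only deltas that agree with the elected sign.
        agree = (torch.sign(stacked) == elected) & (stacked != 0)
        count = agree.sum(dim=0).clamp(min=1)
        merged[name] = base + weight * (stacked * agree).sum(dim=0) / count
    return merged
```

Zeroing disagreeing deltas before averaging is the core idea of TIES: it reduces interference between fine-tunes that pull the same parameter in opposite directions.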
Key Capabilities & Performance
This merged model demonstrates robust performance across a range of benchmarks, as evaluated on the Open LLM Leaderboard. Key scores include:
- Average Score: 71.37
- AI2 Reasoning Challenge (25-shot): 67.92
- HellaSwag (10-shot): 85.93
- MMLU (5-shot): 64.07
- TruthfulQA (0-shot): 61.31
- Winogrande (5-shot): 80.03
- GSM8k (5-shot): 68.54
These results point to strong general knowledge, commonsense reasoning, and language understanding. The model operates with a 4096-token context length.
Good for
- General-purpose text generation and understanding tasks.
- Applications requiring a 7B parameter model with balanced performance across various benchmarks.
- Developers looking for a Mistral-based model that combines strengths from multiple fine-tuned versions.
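For inference, the model loads through the standard Hugging Face transformers API. The snippet below is a minimal sketch; the prompt and sampling parameters are illustrative.

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "mychen76/mistral-7b-merged-ties"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id, torch_dtype="auto", device_map="auto"
)

prompt = "Explain the difference between supervised and unsupervised learning."
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)

# Keep prompt plus completion within the 4096-token context window.
outputs = model.generate(**inputs, max_new_tokens=256, do_sample=True, temperature=0.7)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```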