Model Overview
NeverSleep/Mistral-11B-SynthIAirOmniMix is a 10.7 billion parameter language model built upon the Mistral architecture. Developed by NeverSleep, this model is a sophisticated merge of four distinct Mistral-7B variants: SynthIA-7B-v1.5, Mistral-7B-v0.1-Open-Platypus, CollectiveCognition-v1.1-Mistral-7B, and airoboros-mistral2.2-7b. The primary goal of this merge was to investigate whether combining models that share a consistent prompt format could lead to enhanced overall performance, moving away from previous mixes that included Zephyr and OpenOrca.
Key Characteristics
- Architecture: Based on the Mistral-7B family, scaled to 10.7 billion parameters through merging.
- Merging Method: Uses a slerp merge via mergekit, carefully blending the layers and components of the base models.
- Prompt Format: Optimized for the template `SYSTEM: {context}` (the SYSTEM line is not mandatory), followed by `USER: {prompt}` and `ASSISTANT:`; an alternative format may also be supported.
- Context Length: Supports a context window of 4096 tokens.
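Slerp merges in mergekit are typically described by a YAML recipe that interpolates between two models layer by layer. The fragment below is an illustrative sketch for blending two of the listed bases, not the actual recipe used for this model; the layer ranges, interpolation weights, and dtype are assumptions:

```yaml
# Illustrative mergekit slerp recipe (assumed values, not the published one)
slices:
  - sources:
      - model: SynthIA-7B-v1.5
        layer_range: [0, 32]
      - model: airoboros-mistral2.2-7b
        layer_range: [0, 32]
merge_method: slerp
base_model: SynthIA-7B-v1.5
parameters:
  t:
    - filter: self_attn   # attention layers lean toward the second model mid-stack
      value: [0, 0.5, 0.3, 0.7, 1]
    - filter: mlp         # MLP layers use the mirrored schedule
      value: [1, 0.5, 0.7, 0.3, 0]
    - value: 0.5          # default interpolation for everything else
dtype: bfloat16
```

Per-filter `t` schedules like these let a merge weight attention and MLP sublayers differently across the depth of the network.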
Performance Insights
Evaluations on standard benchmarks indicate competitive performance for its size class:
- ARC Challenge: 56.40 (acc_norm)
- HellaSwag: 81.67 (acc_norm)
- MMLU (5-shot): 63.47
- TruthfulQA (0-shot): 55.69
- Winogrande (5-shot): 76.4
Good For
- Developers experimenting with merged models and their impact on performance.
- Applications requiring a capable 10.7B-parameter model with a focus on general language understanding and generation.
- Use cases where a consistent prompt format across merged components is beneficial for predictable output.