NeverSleep/Mistral-11B-AirOmniMix
NeverSleep/Mistral-11B-AirOmniMix is a 10.7 billion parameter language model created by NeverSleep, built by merging four Mistral-7B variants: Open-Orca/Mistral-7B-OpenOrca, akjindal53244/Mistral-7B-v0.1-Open-Platypus, teknium/CollectiveCognition-v1.1-Mistral-7B, and teknium/airoboros-mistral2.2-7b. The final model is a slerp merge of two intermediate passthrough merges, combining the strengths of its constituent models, and offers a 4096-token context length. It is designed for general-purpose conversational and instruction-following tasks, with competitive results on benchmarks such as ARC Challenge, HellaSwag, and TruthfulQA.
NeverSleep/Mistral-11B-AirOmniMix Overview
NeverSleep/Mistral-11B-AirOmniMix is a 10.7 billion parameter language model developed by NeverSleep, constructed through a sophisticated merging process using mergekit. This model integrates four distinct Mistral-7B variants, specifically:
- Open-Orca/Mistral-7B-OpenOrca
- akjindal53244/Mistral-7B-v0.1-Open-Platypus
- teknium/CollectiveCognition-v1.1-Mistral-7B
- teknium/airoboros-mistral2.2-7b
Key Capabilities & Merging Strategy
The model employs a multi-stage merging approach. Initially, two intermediate merges were performed:
- Mistral-11B-OpenOrcaPlatypus: a passthrough merge (layer stacking; see the sketch after this list) combining layers from Open-Orca/Mistral-7B-OpenOrca and akjindal53244/Mistral-7B-v0.1-Open-Platypus.
- Mistral-11B-CC-Airo: another passthrough merge integrating teknium/CollectiveCognition-v1.1-Mistral-7B and teknium/airoboros-mistral2.2-7b.
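A passthrough merge stacks layer slices from the donor models rather than averaging their weights. Below is a conceptual Python sketch of the idea; the slice boundaries are illustrative assumptions, not the card's published configuration. Two 32-layer Mistral-7B models stacked this way yield roughly 48 layers, which is about 10.7B parameters.

```python
# Conceptual sketch of a passthrough (layer-stacking) merge.
# NOTE: the slice ranges [:24] and [8:] are hypothetical, chosen only to
# illustrate how two 32-layer models can produce a ~48-layer, ~10.7B model.
import torch.nn as nn
from transformers import AutoModelForCausalLM

model_a = AutoModelForCausalLM.from_pretrained("Open-Orca/Mistral-7B-OpenOrca")
model_b = AutoModelForCausalLM.from_pretrained(
    "akjindal53244/Mistral-7B-v0.1-Open-Platypus"
)

# Stack the first 24 decoder layers of model A on top of the last 24 of
# model B, then keep model A's embeddings, final norm, and lm_head.
merged_layers = nn.ModuleList(
    list(model_a.model.layers[:24]) + list(model_b.model.layers[8:])
)
model_a.model.layers = merged_layers
model_a.config.num_hidden_layers = len(merged_layers)  # 48 layers
```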
The final Mistral-11B-AirOmniMix is a slerp merge of these two 11B intermediate models, with specific parameter weighting applied to different components such as `lm_head`, `embed_tokens`, `self_attn`, `mlp`, and the layer norms to optimize performance. This staged merging aims to harness the diverse strengths of the base models.
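For intuition, here is a minimal sketch of slerp (spherical linear interpolation) between two weight tensors. mergekit's actual implementation also applies the per-component interpolation factors mentioned above and handles edge cases differently; this shows only the core formula.

```python
# Minimal slerp sketch: interpolate between two weight tensors along the
# unit hypersphere instead of a straight line (as plain averaging would).
import torch

def slerp(t: float, a: torch.Tensor, b: torch.Tensor, eps: float = 1e-8) -> torch.Tensor:
    a_flat, b_flat = a.flatten().float(), b.flatten().float()
    a_n = a_flat / (a_flat.norm() + eps)
    b_n = b_flat / (b_flat.norm() + eps)
    dot = torch.clamp(a_n @ b_n, -1.0, 1.0)
    theta = torch.acos(dot)            # angle between the two weight vectors
    if theta.abs() < 1e-4:             # nearly parallel: fall back to lerp
        return (1 - t) * a + t * b
    sin_theta = torch.sin(theta)
    w_a = torch.sin((1 - t) * theta) / sin_theta
    w_b = torch.sin(t * theta) / sin_theta
    return (w_a * a_flat + w_b * b_flat).reshape(a.shape).to(a.dtype)
```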
Performance & Prompting
With a context length of 4096 tokens, the model demonstrates solid performance on various benchmarks (a reproduction sketch follows the list), including:
- ARC Challenge: 0.5836 (acc_norm)
- HellaSwag: 0.8250 (acc_norm)
- TruthfulQA: 0.5606 (mc2)
- Winogrande: 0.7395 (acc)
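These metric names (acc_norm, mc2, acc) follow EleutherAI's lm-evaluation-harness conventions. A hedged reproduction sketch, assuming a recent (v0.4+) harness install; the exact harness version behind the reported numbers is not stated:

```python
# Reproduction sketch using EleutherAI's lm-evaluation-harness
# (pip install lm-eval). Task names match the reported metrics above.
import lm_eval

results = lm_eval.simple_evaluate(
    model="hf",
    model_args="pretrained=NeverSleep/Mistral-11B-AirOmniMix",
    tasks=["arc_challenge", "hellaswag", "truthfulqa_mc2", "winogrande"],
)
print(results["results"])
```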
It supports flexible prompting: a simple `USER: <prompt>\nASSISTANT:` format is recommended, but instruction-based templates or the prompt formats of its source models also work.
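A minimal usage sketch with the recommended prompt format, using the Hugging Face transformers API (the example prompt text is illustrative):

```python
# Load the model and generate with the recommended USER:/ASSISTANT: format.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "NeverSleep/Mistral-11B-AirOmniMix"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, device_map="auto")

prompt = "USER: Explain slerp merging in one sentence.\nASSISTANT:"
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=128)

# Decode only the newly generated tokens.
print(tokenizer.decode(outputs[0][inputs["input_ids"].shape[1]:],
                       skip_special_tokens=True))
```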