NeverSleep/Mistral-11B-AirOmniMix

Text generation · Model size: 10.7B · Quant: FP8 · Context length: 4k · Published: Oct 14, 2023 · License: cc-by-nc-4.0 · Architecture: Transformer

NeverSleep/Mistral-11B-AirOmniMix is a 10.7 billion parameter language model created by NeverSleep by merging four Mistral-7B variants: Open-Orca/Mistral-7B-OpenOrca, akjindal53244/Mistral-7B-v0.1-Open-Platypus, teknium/CollectiveCognition-v1.1-Mistral-7B, and teknium/airoboros-mistral2.2-7b. The merge combines two passthrough intermediates with a final slerp step, yielding a model with a 4096-token context length that is designed for general-purpose conversational and instruction-following tasks and performs competitively on benchmarks such as ARC Challenge, HellaSwag, and TruthfulQA.


NeverSleep/Mistral-11B-AirOmniMix Overview

NeverSleep/Mistral-11B-AirOmniMix is a 10.7 billion parameter language model developed by NeverSleep, constructed through a sophisticated merging process using mergekit. This model integrates four distinct Mistral-7B variants, specifically:

  • Open-Orca/Mistral-7B-OpenOrca
  • akjindal53244/Mistral-7B-v0.1-Open-Platypus
  • teknium/CollectiveCognition-v1.1-Mistral-7B
  • teknium/airoboros-mistral2.2-7b

Key Capabilities & Merging Strategy

The model employs a multi-stage merging approach. Initially, two intermediate merges were performed:

  • Mistral-11B-OpenOrcaPlatypus: A passthrough merge combining layers from Open-Orca/Mistral-7B-OpenOrca and akjindal53244/Mistral-7B-v0.1-Open-Platypus.
  • Mistral-11B-CC-Airo: Another passthrough merge integrating CollectiveCognition-v1.1-Mistral-7B and airoboros-mistral2.2-7b.
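A passthrough merge stacks slices of transformer layers from two models into a single deeper model, which is how two 32-layer Mistral-7B models can produce an ~11B intermediate. The sketch below is illustrative only: the layer slices shown are assumptions, not the actual ranges from the model's mergekit configuration.

```python
# Illustrative sketch of a "passthrough" (layer-stacking) merge of the kind
# used for the two intermediate 11B models. Layers are represented here by
# name strings; the slice boundaries are assumed for illustration.

def passthrough_merge(layers_a, layers_b, slice_a, slice_b):
    """Stack a slice of model A's layers on top of a slice of model B's."""
    return layers_a[slice_a] + layers_b[slice_b]

# Two 32-layer Mistral-7B models
a = [f"A.layer.{i}" for i in range(32)]
b = [f"B.layer.{i}" for i in range(32)]

# Overlapping slices give 48 layers, roughly 10.7B parameters at Mistral-7B width
merged = passthrough_merge(a, b, slice(0, 24), slice(8, 32))
print(len(merged))  # 48
```

The overlap between the two slices duplicates some mid-network layers, a common pattern in these "frankenmerge" depth-upscaling recipes.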

The final Mistral-11B-AirOmniMix is a slerp merge of these two 11B intermediate models, with specific parameter weighting applied to different components like lm_head, embed_tokens, self_attn, mlp, and layernorm to optimize performance. This intricate merging aims to harness the diverse strengths of its base models.
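Slerp (spherical linear interpolation) blends two weight tensors along the arc between them rather than along a straight line, which preserves their magnitudes better than plain averaging. A minimal sketch of the operation, with the interpolation factor `t` playing the role of the per-component weights the card mentions (this is a generic slerp, not the mergekit implementation):

```python
import math

def slerp(t, v0, v1, eps=1e-8):
    """Spherical linear interpolation between two weight vectors.
    Falls back to linear interpolation when the vectors are nearly colinear."""
    norm0 = math.sqrt(sum(x * x for x in v0))
    norm1 = math.sqrt(sum(x * x for x in v1))
    dot = sum(a * b for a, b in zip(v0, v1)) / (norm0 * norm1 + eps)
    dot = max(-1.0, min(1.0, dot))
    if dot > 0.9995:  # nearly colinear: plain lerp is numerically safer
        return [(1 - t) * a + t * b for a, b in zip(v0, v1)]
    theta = math.acos(dot)
    s0 = math.sin((1 - t) * theta) / math.sin(theta)
    s1 = math.sin(t * theta) / math.sin(theta)
    return [s0 * a + s1 * b for a, b in zip(v0, v1)]

# Midpoint between two orthogonal unit vectors stays on the unit sphere
print(slerp(0.5, [1.0, 0.0], [0.0, 1.0]))  # ~[0.7071, 0.7071]
```

In a real merge, `t` would be set separately for each parameter group (`lm_head`, `embed_tokens`, `self_attn`, `mlp`, `layernorm`), as the weighting scheme described above implies.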

Performance & Prompting

With a context length of 4096 tokens, the model demonstrates solid performance on various benchmarks, including:

  • ARC Challenge: 0.5836 (acc_norm)
  • HellaSwag: 0.8250 (acc_norm)
  • TruthfulQA: 0.5606 (mc2)
  • Winogrande: 0.7395 (acc)

It supports flexible prompting: the recommended format is a simple `USER: <prompt>\nASSISTANT:` template, but instruction-style templates and the prompt formats of its source models also work.
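The recommended format can be produced with a tiny helper; `build_prompt` is an illustrative function, not part of the model's tooling:

```python
# Sketch of the recommended "USER: <prompt>\nASSISTANT:" format.
# build_prompt is a hypothetical helper for illustration.

def build_prompt(user_message: str) -> str:
    return f"USER: {user_message}\nASSISTANT:"

prompt = build_prompt("Explain model merging in one sentence.")
print(prompt)
# USER: Explain model merging in one sentence.
# ASSISTANT:
```

The resulting string would be passed to the tokenizer and model as with any Hugging Face causal LM; generation should then continue from the trailing `ASSISTANT:` marker.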