NeverSleep/Mistral-11B-AirOmniMix

Text generation · Model size: 10.7B · Quant: FP8 · Context length: 4k · Published: Oct 14, 2023 · License: cc-by-nc-4.0 · Architecture: Transformer

NeverSleep/Mistral-11B-AirOmniMix is a 10.7 billion parameter language model created by NeverSleep by merging four Mistral-7B variants: Open-Orca/Mistral-7B-OpenOrca, akjindal53244/Mistral-7B-v0.1-Open-Platypus, teknium/CollectiveCognition-v1.1-Mistral-7B, and teknium/airoboros-mistral2.2-7b. The merge combines two passthrough intermediates with a final slerp step, yielding a model with a 4096-token context length that is designed for general-purpose conversational and instruction-following tasks and performs competitively on benchmarks such as ARC Challenge, HellaSwag, and TruthfulQA.


NeverSleep/Mistral-11B-AirOmniMix Overview

NeverSleep/Mistral-11B-AirOmniMix is a 10.7 billion parameter language model developed by NeverSleep, constructed through a sophisticated merging process using mergekit. This model integrates four distinct Mistral-7B variants, specifically:

  • Open-Orca/Mistral-7B-OpenOrca
  • akjindal53244/Mistral-7B-v0.1-Open-Platypus
  • teknium/CollectiveCognition-v1.1-Mistral-7B
  • teknium/airoboros-mistral2.2-7b

Key Capabilities & Merging Strategy

The model employs a multi-stage merging approach. Initially, two intermediate merges were performed:

  • Mistral-11B-OpenOrcaPlatypus: A passthrough merge combining layers from Open-Orca/Mistral-7B-OpenOrca and akjindal53244/Mistral-7B-v0.1-Open-Platypus.
  • Mistral-11B-CC-Airo: Another passthrough merge integrating CollectiveCognition-v1.1-Mistral-7B and airoboros-mistral2.2-7b.
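A passthrough merge stacks slices of transformer layers from two models into a single deeper model, which is how two 32-layer Mistral-7B models can produce an ~11B intermediate. The sketch below is illustrative only: the layer slices shown are assumptions, not the actual ranges from the model's mergekit configuration.

```python
# Illustrative sketch of a "passthrough" (layer-stacking) merge of the kind
# used for the two intermediate 11B models. Layers are represented here by
# name strings; the slice boundaries are assumed for illustration.

def passthrough_merge(layers_a, layers_b, slice_a, slice_b):
    """Stack a slice of model A's layers on top of a slice of model B's."""
    return layers_a[slice_a] + layers_b[slice_b]

# Two 32-layer Mistral-7B models
a = [f"A.layer.{i}" for i in range(32)]
b = [f"B.layer.{i}" for i in range(32)]

# Overlapping slices give 48 layers, roughly 10.7B parameters at Mistral-7B width
merged = passthrough_merge(a, b, slice(0, 24), slice(8, 32))
print(len(merged))  # 48
```

The overlap between the two slices duplicates some mid-network layers, a common pattern in these "frankenmerge" depth-upscaling recipes.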

The final Mistral-11B-AirOmniMix is a slerp merge of these two 11B intermediate models, with specific parameter weighting applied to different components like lm_head, embed_tokens, self_attn, mlp, and layernorm to optimize performance. This intricate merging aims to harness the diverse strengths of its base models.
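Slerp (spherical linear interpolation) blends two weight tensors along the arc between them rather than along a straight line, which preserves their magnitudes better than plain averaging. A minimal sketch of the operation, with the interpolation factor `t` playing the role of the per-component weights the card mentions (this is a generic slerp, not the mergekit implementation):

```python
import math

def slerp(t, v0, v1, eps=1e-8):
    """Spherical linear interpolation between two weight vectors.
    Falls back to linear interpolation when the vectors are nearly colinear."""
    norm0 = math.sqrt(sum(x * x for x in v0))
    norm1 = math.sqrt(sum(x * x for x in v1))
    dot = sum(a * b for a, b in zip(v0, v1)) / (norm0 * norm1 + eps)
    dot = max(-1.0, min(1.0, dot))
    if dot > 0.9995:  # nearly colinear: plain lerp is numerically safer
        return [(1 - t) * a + t * b for a, b in zip(v0, v1)]
    theta = math.acos(dot)
    s0 = math.sin((1 - t) * theta) / math.sin(theta)
    s1 = math.sin(t * theta) / math.sin(theta)
    return [s0 * a + s1 * b for a, b in zip(v0, v1)]

# Midpoint between two orthogonal unit vectors stays on the unit sphere
print(slerp(0.5, [1.0, 0.0], [0.0, 1.0]))  # ~[0.7071, 0.7071]
```

In a real merge, `t` would be set separately for each parameter group (`lm_head`, `embed_tokens`, `self_attn`, `mlp`, `layernorm`), as the weighting scheme described above implies.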

Performance & Prompting

With a context length of 4096 tokens, the model demonstrates solid performance on various benchmarks, including:

  • ARC Challenge: 0.5836 (acc_norm)
  • HellaSwag: 0.8250 (acc_norm)
  • TruthfulQA: 0.5606 (mc2)
  • Winogrande: 0.7395 (acc)

It supports flexible prompting: the recommended format is a simple `USER: <prompt>\nASSISTANT:` template, but instruction-style templates and the prompt formats of its source models also work.
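The recommended format can be produced with a tiny helper; `build_prompt` is an illustrative function, not part of the model's tooling:

```python
# Sketch of the recommended "USER: <prompt>\nASSISTANT:" format.
# build_prompt is a hypothetical helper for illustration.

def build_prompt(user_message: str) -> str:
    return f"USER: {user_message}\nASSISTANT:"

prompt = build_prompt("Explain model merging in one sentence.")
print(prompt)
# USER: Explain model merging in one sentence.
# ASSISTANT:
```

The resulting string would be passed to the tokenizer and model as with any Hugging Face causal LM; generation should then continue from the trailing `ASSISTANT:` marker.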