NeverSleep/Mistral-11B-SynthIAirOmniMix

Text Generation
  • Model size: 10.7B
  • Quant: FP8
  • Context length: 4k
  • Published: Oct 14, 2023
  • License: cc-by-nc-4.0
  • Architecture: Transformer

NeverSleep/Mistral-11B-SynthIAirOmniMix is a 10.7 billion parameter merged language model based on the Mistral architecture, created by NeverSleep. This model is a blend of several Mistral-7B variants, including SynthIA-7B-v1.5, Mistral-7B-v0.1-Open-Platypus, CollectiveCognition-v1.1-Mistral-7B, and airoboros-mistral2.2-7b, using a slerp merge method. It is designed to explore the effectiveness of merging models with consistent prompt formats, aiming for improved general performance across various tasks within its 4096-token context window.


Model Overview

NeverSleep/Mistral-11B-SynthIAirOmniMix is a 10.7 billion parameter language model built upon the Mistral architecture. Developed by NeverSleep, this model is a sophisticated merge of four distinct Mistral-7B variants: SynthIA-7B-v1.5, Mistral-7B-v0.1-Open-Platypus, CollectiveCognition-v1.1-Mistral-7B, and airoboros-mistral2.2-7b. The primary goal of this merge was to investigate whether combining models that share a consistent prompt format could lead to enhanced overall performance, moving away from previous mixes that included Zephyr and OpenOrca.
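Slerp (spherical linear interpolation) blends each pair of corresponding weight tensors along the arc between them rather than the straight line, which tends to preserve weight norms better than plain averaging. The following is a minimal illustrative sketch of the per-tensor operation, not NeverSleep's actual merge code; the function name and fallback behavior are our own choices.

```python
import numpy as np

def slerp(t, v0, v1, eps=1e-8):
    """Spherically interpolate between two weight tensors at fraction t.

    Illustrative sketch of the per-tensor slerp step used by merge
    tooling such as mergekit; not the model's actual merge script.
    """
    v0_flat = v0.ravel().astype(np.float64)
    v1_flat = v1.ravel().astype(np.float64)
    # Angle between the two flattened, normalized weight vectors.
    dot = np.dot(v0_flat / np.linalg.norm(v0_flat),
                 v1_flat / np.linalg.norm(v1_flat))
    omega = np.arccos(np.clip(dot, -1.0, 1.0))
    if np.abs(np.sin(omega)) < eps:
        # Nearly colinear tensors: fall back to linear interpolation.
        return (1.0 - t) * v0 + t * v1
    # Standard slerp coefficients.
    s0 = np.sin((1.0 - t) * omega) / np.sin(omega)
    s1 = np.sin(t * omega) / np.sin(omega)
    return (s0 * v0_flat + s1 * v1_flat).reshape(v0.shape)
```

In a real merge this is applied tensor-by-tensor across the checkpoints, often with a different interpolation fraction per layer group, which is what "carefully blending different layers and components" refers to.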

Key Characteristics

  • Architecture: Based on the Mistral-7B family, scaled to 10.7 billion parameters through merging.
  • Merging Method: Utilizes a slerp merge method via mergekit, carefully blending different layers and components of the base models.
  • Prompt Format: Optimized for the template SYSTEM: {context}\nUSER: {prompt}\nASSISTANT:, where the SYSTEM line is optional.
  • Context Length: Supports a context window of 4096 tokens.
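Because the merge was built around a single shared prompt format, assembling requests consistently matters. A small helper like the following (the function name is ours, not part of the model card) shows one way to build the recommended template, treating the SYSTEM line as optional:

```python
def build_prompt(user_prompt, context=None):
    """Assemble the prompt template recommended for this model.

    The SYSTEM line is optional per the model card; this helper
    is an illustrative convenience, not an official API.
    """
    parts = []
    if context is not None:
        parts.append(f"SYSTEM: {context}")
    parts.append(f"USER: {user_prompt}")
    parts.append("ASSISTANT:")
    return "\n".join(parts)
```

The resulting string is what you would pass to the tokenizer; generation then continues after the trailing "ASSISTANT:" marker.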

Performance Insights

Evaluations on standard benchmarks indicate competitive performance for its size class:

  • ARC Challenge: 56.40 (acc_norm)
  • HellaSwag: 81.67 (acc_norm)
  • MMLU (5-shot): 63.47
  • TruthfulQA (0-shot): 55.69
  • Winogrande (5-shot): 76.40

Good For

  • Developers experimenting with merged models and their impact on performance.
  • Applications requiring a capable 10.7B-parameter model with a focus on general language understanding and generation.
  • Use cases where a consistent prompt format across merged components is beneficial for predictable output.