kaitchup/TheMayonnaise

Text Generation · Model Size: 7B · Quantization: FP8 · Context Length: 8K · Published: Jan 27, 2024 · License: apache-2.0 · Architecture: Transformer · Concurrency Cost: 1 · Open Weights

TheMayonnaise is a 7 billion parameter causal language model developed by The Kaitchup, created by TIES-merging several fine-tuned models derived from Mistral-7B-v0.1. The model is optimized for general reasoning and language understanding, achieving an average score of 74.94 on the Open LLM Leaderboard. It is designed for English-language tasks and offers a context length of 8,192 tokens.


Overview

TheMayonnaise is a 7 billion parameter causal language model developed by The Kaitchup. It is an English-language model built upon mistralai/Mistral-7B-v0.1 using TIES-merging, a technique that combines several fine-tunes of the same base model into a single checkpoint. Its creation process is detailed in the article "The Mayonnaise: Rank First on the Open LLM Leaderboard with TIES-Merging."
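Because the merge preserves the Mistral architecture, the checkpoint loads like any Hugging Face causal LM. A minimal sketch, assuming transformers (and accelerate for device_map) are installed; the prompt and generation settings are illustrative:

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "kaitchup/TheMayonnaise"  # repository id from this page

# Load tokenizer and weights; dtype/device choices are illustrative
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype="auto",   # e.g. float16 on most GPUs
    device_map="auto",    # requires the accelerate package
)

prompt = "Briefly explain what model merging is."
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=128)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```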

Key Capabilities & Performance

The model demonstrates strong performance across various benchmarks, as evaluated on the Open LLM Leaderboard:

  • Average Score: 74.94
  • AI2 Reasoning Challenge (25-shot): 73.46
  • HellaSwag (10-shot): 88.46
  • MMLU (5-shot): 64.88
  • TruthfulQA (0-shot): 69.19
  • Winogrande (5-shot): 84.29
  • GSM8k (5-shot): 69.37
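The headline average is simply the arithmetic mean of the six benchmark scores above, which is easy to verify:

```python
# Open LLM Leaderboard scores from the list above
scores = {
    "ARC (25-shot)": 73.46,
    "HellaSwag (10-shot)": 88.46,
    "MMLU (5-shot)": 64.88,
    "TruthfulQA (0-shot)": 69.19,
    "Winogrande (5-shot)": 84.29,
    "GSM8k (5-shot)": 69.37,
}

average = sum(scores.values()) / len(scores)
print(f"{average:.2f}")  # 74.94, matching the reported average
```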

Unique Aspects

This model stands out for its creation method: it was produced with mergekit using a TIES-merging strategy, combining mncai/mistral-7b-dpo-v5, kaitchup/Mayonnaise-4in1-02, and BarryFutureman/NeuralTurdusVariant1-7B on the mistralai/Mistral-7B-v0.1 base. The model operates under an Apache 2.0 License.
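For illustration, a mergekit TIES merge is described by a YAML config that names the base model and the checkpoints to combine. The sketch below emits such a config from Python; the density and weight values are placeholders, not the settings actually used to build TheMayonnaise:

```python
import yaml  # pip install pyyaml

# Hypothetical TIES-merge config in mergekit's YAML schema.
# Densities and weights are illustrative placeholders only.
config = {
    "merge_method": "ties",
    "base_model": "mistralai/Mistral-7B-v0.1",
    "models": [
        {"model": "mncai/mistral-7b-dpo-v5",
         "parameters": {"density": 0.5, "weight": 0.4}},
        {"model": "kaitchup/Mayonnaise-4in1-02",
         "parameters": {"density": 0.5, "weight": 0.3}},
        {"model": "BarryFutureman/NeuralTurdusVariant1-7B",
         "parameters": {"density": 0.5, "weight": 0.3}},
    ],
    "parameters": {"normalize": True},
    "dtype": "float16",
}

with open("ties_merge.yaml", "w") as f:
    yaml.safe_dump(config, f, sort_keys=False)

# Then run:  mergekit-yaml ties_merge.yaml ./merged-model
```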

Use Cases

Given its general-purpose nature and strong benchmark results, TheMayonnaise is suitable for a wide range of English NLP tasks requiring reasoning, common sense, and language understanding.

Popular Sampler Settings

Featherless tracks the three parameter combinations most used by its users for this model; the concrete values sit behind interactive tabs. The sampler parameters exposed for tuning are:

  • temperature
  • top_p
  • top_k
  • frequency_penalty
  • presence_penalty
  • repetition_penalty
  • min_p
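These parameters map onto the OpenAI-compatible API that Featherless exposes. A minimal sketch, assuming the api.featherless.ai/v1 base URL and arbitrary example values (not the popular configs from the tabs above); non-standard knobs like top_k, repetition_penalty, and min_p are commonly passed via extra_body, and backend support for them varies:

```python
from openai import OpenAI  # pip install openai

# Assumed OpenAI-compatible endpoint; replace the key with your own
client = OpenAI(
    base_url="https://api.featherless.ai/v1",
    api_key="YOUR_API_KEY",
)

response = client.completions.create(
    model="kaitchup/TheMayonnaise",
    prompt="Summarize TIES-merging in two sentences.",
    max_tokens=128,
    temperature=0.7,          # example values, not a recommended config
    top_p=0.9,
    frequency_penalty=0.0,
    presence_penalty=0.0,
    # Non-standard sampler knobs go through extra_body on many
    # OpenAI-compatible servers; support depends on the backend
    extra_body={"top_k": 40, "repetition_penalty": 1.1, "min_p": 0.05},
)
print(response.choices[0].text)
```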