gagan3012/MetaModel

  • Task: Text generation
  • Model size: 10.7B parameters
  • Quantization: FP8
  • Context length: 4k
  • Published: Jan 3, 2024
  • License: apache-2.0
  • Architecture: Transformer (open weights)

gagan3012/MetaModel is a 10.7 billion parameter language model created by gagan3012, formed by merging jeonsworld/CarbonVillain-en-10.7B-v4 and kekmodel/StopCarbon-10.7B-v5 using the slerp method. The merged model demonstrates balanced performance across benchmarks, with an average score of 74.4 on the Open LLM Leaderboard. It is suitable for general-purpose language understanding and generation tasks, particularly those requiring robust performance across diverse academic and reasoning challenges.


MetaModel Overview

MetaModel is a 10.7 billion parameter language model developed by gagan3012. It is a product of merging two distinct models, jeonsworld/CarbonVillain-en-10.7B-v4 and kekmodel/StopCarbon-10.7B-v5, utilizing the slerp merge method via mergekit. This merging strategy aims to combine the strengths of its constituent models.
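Slerp (spherical linear interpolation) blends two weight tensors along the arc between them rather than along a straight line, which tends to preserve the scale of the merged parameters. The following is a minimal, simplified sketch of the core operation; mergekit's actual implementation handles per-layer interpolation schedules and additional normalization details not shown here.

```python
import numpy as np

def slerp(t: float, v0: np.ndarray, v1: np.ndarray, eps: float = 1e-8) -> np.ndarray:
    """Spherical linear interpolation between two flattened weight tensors.

    t=0 returns v0, t=1 returns v1; intermediate t values follow the
    great-circle arc between the two directions.
    """
    # Angle between the two (normalized) tensors.
    a = v0 / (np.linalg.norm(v0) + eps)
    b = v1 / (np.linalg.norm(v1) + eps)
    dot = np.clip(np.dot(a, b), -1.0, 1.0)
    omega = np.arccos(dot)
    if omega < eps:
        # Nearly parallel tensors: fall back to plain linear interpolation.
        return (1.0 - t) * v0 + t * v1
    so = np.sin(omega)
    return (np.sin((1.0 - t) * omega) / so) * v0 + (np.sin(t * omega) / so) * v1
```

Applied per tensor across both checkpoints, this yields the merged model's weights.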

Key Capabilities & Performance

Evaluated on the Open LLM Leaderboard, MetaModel achieved an average score of 74.4. Notable benchmark results include:

  • ARC (25-shot): 71.08
  • HellaSwag (10-shot): 88.45
  • MMLU (5-shot): 66.26
  • TruthfulQA (0-shot): 71.84
  • Winogrande (5-shot): 83.43
  • GSM8K (5-shot): 65.35

These scores indicate strong general understanding and reasoning capability across a variety of tasks, from common-sense reasoning to academic subjects and mathematical problem-solving.
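The reported 74.4 leaderboard average is simply the unweighted mean of the six task scores above:

```python
# Open LLM Leaderboard scores for MetaModel, as listed above.
scores = {
    "ARC": 71.08,
    "HellaSwag": 88.45,
    "MMLU": 66.26,
    "TruthfulQA": 71.84,
    "Winogrande": 83.43,
    "GSM8K": 65.35,
}
average = sum(scores.values()) / len(scores)
print(round(average, 2))  # → 74.4
```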

Use Cases

MetaModel is well-suited for applications requiring a versatile language model with solid performance across a broad spectrum of tasks. Its balanced benchmark results suggest it can be effectively used for:

  • General text generation and comprehension
  • Question answering
  • Reasoning tasks
  • Educational support in various subjects
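For these use cases, the model can be loaded like any standard causal language model on the Hugging Face Hub. The sketch below uses the transformers library; the dtype, device placement, and generation parameters are illustrative assumptions, not settings prescribed by the model card.

```python
def generate(prompt: str, max_new_tokens: int = 128) -> str:
    """Generate a completion from gagan3012/MetaModel.

    Imports are deferred into the function so it can be defined without
    transformers/torch installed; generation settings are illustrative.
    """
    from transformers import AutoModelForCausalLM, AutoTokenizer
    import torch

    model_id = "gagan3012/MetaModel"
    tokenizer = AutoTokenizer.from_pretrained(model_id)
    model = AutoModelForCausalLM.from_pretrained(
        model_id,
        torch_dtype=torch.float16,  # assumes a GPU with enough memory for 10.7B params
        device_map="auto",
    )
    inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
    output = model.generate(**inputs, max_new_tokens=max_new_tokens, do_sample=False)
    return tokenizer.decode(output[0], skip_special_tokens=True)
```

Greedy decoding (`do_sample=False`) is shown for reproducibility; sampling parameters can be substituted for more varied output.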