gagan3012/MetaModelv3

Text Generation | Concurrency Cost: 1 | Model Size: 10.7B | Quant: FP8 | Ctx Length: 4k | Published: Jan 5, 2024 | License: apache-2.0 | Architecture: Transformer | Open Weights | Cold

gagan3012/MetaModelv3 is a 10.7 billion parameter language model, a hybrid of jeonsworld/CarbonVillain-en-10.7B-v4 and jeonsworld/CarbonVillain-en-10.7B-v2, with a 4096 token context length. It achieves an average score of 74.39 across the six Open LLM Leaderboard benchmarks, which makes it suitable for general-purpose language understanding and generation tasks.

MetaModelv3 Overview

MetaModelv3 is a 10.7 billion parameter language model, built as a hybrid of two CarbonVillain-en-10.7B variants: jeonsworld/CarbonVillain-en-10.7B-v4 and jeonsworld/CarbonVillain-en-10.7B-v2. This model features a context length of 4096 tokens.
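
The 4096 token context length bounds how much text a single request can carry. The sketch below shows one way to budget for it, assuming (as is usual for decoder-only Transformers) that the prompt and the generated tokens share the same window. The helper names are hypothetical and not part of any library.

```python
# Context length stated in this model card.
MAX_CONTEXT_TOKENS = 4096


def max_new_tokens_allowed(prompt_tokens: int,
                           context_length: int = MAX_CONTEXT_TOKENS) -> int:
    """Return how many tokens can still be generated after the prompt.

    Assumes prompt and completion share one context window.
    """
    if prompt_tokens < 0:
        raise ValueError("prompt_tokens must be non-negative")
    # A prompt that already fills the window leaves no room to generate.
    return max(context_length - prompt_tokens, 0)


def fits_in_context(prompt_tokens: int, requested_new_tokens: int,
                    context_length: int = MAX_CONTEXT_TOKENS) -> bool:
    """Check whether prompt plus requested completion fits in the window."""
    return prompt_tokens + requested_new_tokens <= context_length
```

For example, a 4000 token prompt leaves room for at most 96 generated tokens, so a request for 200 new tokens would need a shorter prompt.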

Performance Highlights

Evaluated on the Open LLM Leaderboard, MetaModelv3 achieved an average score of 74.39. Key benchmark results include:

  • ARC (25-shot): 71.16
  • HellaSwag (10-shot): 88.39
  • MMLU (5-shot): 66.32
  • TruthfulQA (0-shot): 71.86
  • Winogrande (5-shot): 83.35
  • GSM8K (5-shot): 65.28

The strongest results are on the common sense benchmarks HellaSwag (88.39) and Winogrande (83.35). The lowest are on mathematical problem-solving (GSM8K, 65.28) and multi-subject knowledge (MMLU, 66.32).
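
The reported average can be checked against the six scores listed above. This minimal sketch assumes the leaderboard average is the unweighted mean of the six benchmarks. The variable names are illustrative.

```python
from statistics import mean

# Benchmark scores as listed in this model card.
scores = {
    "ARC (25-shot)": 71.16,
    "HellaSwag (10-shot)": 88.39,
    "MMLU (5-shot)": 66.32,
    "TruthfulQA (0-shot)": 71.86,
    "Winogrande (5-shot)": 83.35,
    "GSM8K (5-shot)": 65.28,
}

# Unweighted mean, rounded to two decimals like the reported figure.
average = round(mean(scores.values()), 2)
print(average)  # 74.39
```

The six scores sum to 446.36, and 446.36 / 6 ≈ 74.39, which matches the reported average.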

Use Cases

MetaModelv3 is well-suited for applications requiring:

  • General-purpose text generation and understanding.
  • Reasoning and knowledge-based question answering, as evidenced by its ARC and MMLU scores.
  • Common sense reasoning, supported by HellaSwag and Winogrande results.
  • Question answering where avoiding common misconceptions matters, as indicated by its TruthfulQA score.
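
For general-purpose text generation, a minimal loading sketch with the Hugging Face `transformers` library might look like the following. It assumes the weights are published on the Hugging Face Hub under the id `gagan3012/MetaModelv3`, that `transformers` and PyTorch are installed, and that plain-text prompting is acceptable. This card does not specify a prompt template, so none is applied here. The `generate` helper is a hypothetical wrapper, not part of the model's own tooling.

```python
MODEL_ID = "gagan3012/MetaModelv3"  # repository id as given in this card


def generate(prompt: str, max_new_tokens: int = 128) -> str:
    """Load the model and return the prompt plus its generated continuation."""
    # Imported lazily so the module can be imported without transformers.
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
    # torch_dtype="auto" uses the dtype stored in the checkpoint config.
    model = AutoModelForCausalLM.from_pretrained(MODEL_ID, torch_dtype="auto")

    # Tokenize the prompt into PyTorch tensors and generate a continuation.
    inputs = tokenizer(prompt, return_tensors="pt")
    output_ids = model.generate(**inputs, max_new_tokens=max_new_tokens)
    return tokenizer.decode(output_ids[0], skip_special_tokens=True)
```

Calling `generate("Explain why the sky is blue in one sentence.")` downloads the weights on first use. A 10.7B parameter model needs substantial memory, so a GPU or a quantized variant is advisable in practice.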