Gille/StrangeMerges_29-7B-dare_ties

Text Generation · Open Weights

  • Concurrency Cost: 1
  • Model Size: 7B
  • Quantization: FP8
  • Context Length: 4K
  • Published: Feb 21, 2024
  • License: apache-2.0
  • Architecture: Transformer

Gille/StrangeMerges_29-7B-dare_ties is a 7-billion-parameter language model created by Gille by merging Gille/StrangeMerges_21-7B-slerp and CultriX/MonaTrix-v4 with the dare_ties method. The model demonstrates strong general reasoning capabilities, achieving an average score of 76.09 on the Open LLM Leaderboard, with notable results in commonsense reasoning and question answering, and is suitable for a variety of general-purpose language generation tasks.

Model Overview

Gille/StrangeMerges_29-7B-dare_ties is a 7-billion-parameter language model developed by Gille. It was created by merging two existing models, Gille/StrangeMerges_21-7B-slerp and CultriX/MonaTrix-v4, using the dare_ties merge method within LazyMergekit.
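
To give a feel for what the merge method does: dare_ties combines DARE (randomly drop a fraction of each fine-tuned model's parameter deltas from the base model, then rescale the survivors) with TIES (resolve sign conflicts between deltas by majority vote before summing). The sketch below is purely illustrative, using NumPy on toy tensors; it is not mergekit's actual implementation, and the names `dare` and `ties_merge` are hypothetical.

```python
# Illustrative toy sketch of the DARE + TIES merging idea (not mergekit's real code).
import numpy as np

rng = np.random.default_rng(0)

def dare(delta, density):
    # DARE: drop each delta entry with probability (1 - density),
    # then rescale the survivors by 1/density to preserve the expected value.
    mask = rng.random(delta.shape) < density
    return (delta * mask) / density

def ties_merge(base, deltas, weights):
    # TIES: elect a majority sign per parameter, keep only delta entries
    # that agree with it, and add the weighted result to the base weights.
    stacked = np.stack([w * d for w, d in zip(weights, deltas)])
    majority_sign = np.sign(stacked.sum(axis=0))
    agree = np.sign(stacked) == majority_sign
    return base + np.where(agree, stacked, 0.0).sum(axis=0)

# Toy example: two "models" that diverged from a shared base tensor.
base = rng.normal(size=(4, 4))
delta_a = dare(rng.normal(scale=0.1, size=(4, 4)), density=0.5)
delta_b = dare(rng.normal(scale=0.1, size=(4, 4)), density=0.5)
merged = ties_merge(base, [delta_a, delta_b], weights=[0.5, 0.5])
```

In the actual merge, this logic runs per tensor over the full 7B-parameter weights, with the density and weight hyperparameters set in the merge configuration.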

Key Capabilities & Performance

This model exhibits solid performance across a range of benchmarks, as evaluated on the Open LLM Leaderboard. Its average score is 76.09, indicating strong general language understanding and generation abilities; the average is simply the mean of the six results below, as the short check after the list confirms. Specific benchmark results include:

  • AI2 Reasoning Challenge (25-shot): 73.04
  • HellaSwag (10-shot): 89.04
  • MMLU (5-shot): 64.29
  • TruthfulQA (0-shot): 76.98
  • Winogrande (5-shot): 84.53
  • GSM8k (5-shot): 68.69

These scores suggest proficiency in tasks requiring commonsense reasoning, factual recall, and mathematical problem-solving.
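
As a quick sanity check, the reported average is the plain arithmetic mean of the six scores above; a minimal Python check (the dictionary labels are just for readability):

```python
# Verify the Open LLM Leaderboard average is the mean of the six benchmark scores.
scores = {
    "ARC (25-shot)": 73.04,
    "HellaSwag (10-shot)": 89.04,
    "MMLU (5-shot)": 64.29,
    "TruthfulQA (0-shot)": 76.98,
    "Winogrande (5-shot)": 84.53,
    "GSM8k (5-shot)": 68.69,
}
average = sum(scores.values()) / len(scores)
print(round(average, 3))  # 76.095 -> reported as 76.09 on the leaderboard
```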

Use Cases

Given its balanced performance across these benchmarks, StrangeMerges_29-7B-dare_ties is well suited to general-purpose applications such as the following (a minimal loading sketch appears after the list):

  • Text generation and completion
  • Question answering
  • Reasoning tasks
  • Content creation
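
For any of these use cases, the model can be loaded like any other Hugging Face-hosted causal language model. Below is a minimal sketch using the standard Transformers API; it assumes the `transformers`, `torch`, and `accelerate` packages are installed, and the prompt is arbitrary.

```python
# Minimal text-generation sketch with Hugging Face Transformers.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "Gille/StrangeMerges_29-7B-dare_ties"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype="auto",   # use the checkpoint's native precision
    device_map="auto",    # requires `accelerate`; places layers on available devices
)

prompt = "Question: Why is the sky blue?\nAnswer:"
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=128, do_sample=True, temperature=0.7)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```

Note that the 4K context length listed above bounds the combined length of the prompt and the generated continuation.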