Gille/StrangeMerges_37-7B-dare_ties

Text Generation · Concurrency Cost: 1 · Model Size: 7B · Quant: FP8 · Context Length: 4k · Published: Mar 14, 2024 · License: apache-2.0 · Architecture: Transformer · Open Weights

Gille/StrangeMerges_37-7B-dare_ties is a 7-billion-parameter language model created by Gille with the 'dare_ties' merge method. It combines liminerity/M7-7b, Gille/StrangeMerges_30-7B-slerp, and ContextualAI/Contextual_KTO_Mistral_PairRM into a single model with a 4096-token context length. It performs consistently across reasoning, common-sense, and language-understanding benchmarks, making it a compact yet capable choice for general-purpose applications.


Model Overview

Gille/StrangeMerges_37-7B-dare_ties was built with the 'dare_ties' merge method, which combines fine-tuned models by randomly dropping and rescaling each model's parameter deltas (DARE) and resolving sign conflicts among the surviving deltas (TIES). Its constituents are liminerity/M7-7b, Gille/StrangeMerges_30-7B-slerp, and ContextualAI/Contextual_KTO_Mistral_PairRM. The merge aims to retain the strengths of each constituent model while limiting destructive parameter interference.
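For intuition, here is a minimal numpy sketch of the DARE-TIES idea: each fine-tuned model contributes a delta over a shared base, DARE randomly drops a fraction of each delta's entries and rescales the survivors, and a TIES-style sign election keeps only entries that agree with the majority sign before averaging. The function names, drop rate, and the simple mean are illustrative assumptions, not the exact mergekit implementation used for this model.

```python
import numpy as np

def dare(delta: np.ndarray, drop_rate: float, rng: np.random.Generator) -> np.ndarray:
    """DARE: randomly drop entries of a parameter delta, rescale the rest."""
    mask = rng.random(delta.shape) >= drop_rate
    return delta * mask / (1.0 - drop_rate)

def ties_merge(base: np.ndarray, deltas: list[np.ndarray]) -> np.ndarray:
    """TIES-style merge (simplified): elect a per-entry majority sign,
    then average only the delta entries whose sign agrees with it."""
    stacked = np.stack(deltas)                 # (n_models, *param_shape)
    elected = np.sign(stacked.sum(axis=0))     # majority sign per entry
    agree = np.sign(stacked) == elected        # entries matching the elected sign
    counts = np.maximum(agree.sum(axis=0), 1)  # avoid division by zero
    merged_delta = (stacked * agree).sum(axis=0) / counts
    return base + merged_delta

# Toy example: three "fine-tunes" of one base weight tensor.
rng = np.random.default_rng(0)
base = rng.normal(size=(4, 4))
finetuned = [base + rng.normal(scale=0.1, size=base.shape) for _ in range(3)]
deltas = [dare(ft - base, drop_rate=0.5, rng=rng) for ft in finetuned]
merged = ties_merge(base, deltas)
```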

Key Capabilities & Performance

This model shows a balanced performance profile across a range of benchmarks, as evaluated on the Open LLM Leaderboard. Its average score of 70.44 is the unweighted mean of the six task scores below (a quick check follows the list):

  • AI2 Reasoning Challenge (25-Shot): 70.31
  • HellaSwag (10-Shot): 86.82
  • MMLU (5-Shot): 59.40
  • TruthfulQA (0-shot): 75.23
  • Winogrande (5-shot): 81.85
  • GSM8k (5-shot): 49.05
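
As a sanity check on the reported average, a few lines of Python confirm that 70.44 is the unweighted mean of the six scores above:

```python
scores = {
    "ARC (25-shot)": 70.31,
    "HellaSwag (10-shot)": 86.82,
    "MMLU (5-shot)": 59.40,
    "TruthfulQA (0-shot)": 75.23,
    "Winogrande (5-shot)": 81.85,
    "GSM8k (5-shot)": 49.05,
}
average = sum(scores.values()) / len(scores)
print(f"{average:.2f}")  # 70.44
```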

Use Cases

Given its balanced performance across reasoning, common-sense, and language-understanding tasks, StrangeMerges_37-7B-dare_ties suits applications that need a versatile 7B-parameter model. Its 4096-token context length supports moderately sized inputs for generative and analytical tasks.
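
Below is a minimal sketch of loading the model for text generation with the standard Hugging Face transformers API; it assumes a GPU with enough memory for a 7B model in half precision, and the prompt and generation settings are illustrative only:

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "Gille/StrangeMerges_37-7B-dare_ties"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.float16,  # half precision to fit a 7B model on one GPU
    device_map="auto",
)

prompt = "Briefly explain what a model merge is."
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(
    **inputs,
    max_new_tokens=256,  # prompt + output must stay within the 4096-token context
    do_sample=True,
    temperature=0.7,
)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```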