Gille/StrangeMerges_26-7B-dare_ties

Text generation · Model size: 7B · Quantization: FP8 · Context length: 4k · Published: Feb 19, 2024 · License: apache-2.0 · Architecture: Transformer · Open weights

StrangeMerges_26-7B-dare_ties is a 7 billion parameter language model developed by Gille, created by merging paulml/OGNO-7B and Gille/StrangeMerges_25-7B-dare_ties using the DARE TIES method. This model achieves an average score of 76.19 on the Open LLM Leaderboard, demonstrating strong performance across various reasoning and language understanding benchmarks. With a 4096-token context length, it is suitable for general-purpose text generation and understanding tasks.


Overview

StrangeMerges_26-7B-dare_ties is a 7 billion parameter language model developed by Gille. It was created by merging two existing models, paulml/OGNO-7B and Gille/StrangeMerges_25-7B-dare_ties, using the DARE TIES merging method, with Gille/StrangeMerges_21-7B-slerp as the base model.
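DARE TIES merges of this kind are typically produced with the mergekit tool. The card does not reproduce the exact configuration, so the density and weight values in the sketch below are illustrative assumptions; only the model names, merge method, and base model come from the card:

```yaml
# Illustrative mergekit config for a DARE TIES merge of the two source
# models named on the card. density/weight values are assumptions.
models:
  - model: paulml/OGNO-7B
    parameters:
      density: 0.5   # fraction of delta weights kept (assumed)
      weight: 0.5    # contribution to the merge (assumed)
  - model: Gille/StrangeMerges_25-7B-dare_ties
    parameters:
      density: 0.5
      weight: 0.5
merge_method: dare_ties
base_model: Gille/StrangeMerges_21-7B-slerp
dtype: bfloat16
```

In DARE TIES, each source model's delta from the base is randomly sparsified and rescaled (DARE), then sign conflicts between the deltas are resolved before summing (TIES), which tends to preserve each parent's strengths with less interference than naive averaging.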

Key Capabilities & Performance

This model demonstrates competitive performance on the Open LLM Leaderboard, achieving an average score of 76.19. Specific benchmark results include:

  • AI2 Reasoning Challenge (25-shot): 72.95
  • HellaSwag (10-shot): 89.00
  • MMLU (5-shot): 64.35
  • TruthfulQA (0-shot): 76.39
  • Winogrande (5-shot): 84.45
  • GSM8k (5-shot): 69.98
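The quoted leaderboard average is simply the arithmetic mean of these six benchmark scores, which is easy to verify:

```python
# Benchmark scores from the card; their mean should match the
# reported Open LLM Leaderboard average of 76.19.
scores = {
    "ARC (25-shot)": 72.95,
    "HellaSwag (10-shot)": 89.00,
    "MMLU (5-shot)": 64.35,
    "TruthfulQA (0-shot)": 76.39,
    "Winogrande (5-shot)": 84.45,
    "GSM8k (5-shot)": 69.98,
}
average = round(sum(scores.values()) / len(scores), 2)
print(average)  # → 76.19
```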

Good For

  • General text generation and completion tasks.
  • Applications requiring strong reasoning and common sense understanding.
  • Use cases benefiting from a 7B parameter model with a 4096-token context window.
  • Developers looking for a merged model with a balanced performance profile across various benchmarks.
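As a minimal sketch, the model can be loaded through the Hugging Face transformers library. The generation settings and the plain completion-style prompt format below are assumptions (the card specifies no chat template); only the model id and the 4096-token context length come from the card:

```python
MODEL_ID = "Gille/StrangeMerges_26-7B-dare_ties"
MAX_CTX = 4096  # context length stated on the card


def build_prompt(instruction: str) -> str:
    # Plain completion-style prompt; this format is an assumption,
    # since the card does not document a chat template.
    return f"{instruction}\n"


def generate(instruction: str, max_new_tokens: int = 256) -> str:
    # Imported locally so build_prompt stays usable without transformers installed.
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
    model = AutoModelForCausalLM.from_pretrained(MODEL_ID, device_map="auto")
    inputs = tokenizer(
        build_prompt(instruction),
        return_tensors="pt",
        truncation=True,      # keep the prompt within the 4k context window
        max_length=MAX_CTX,
    ).to(model.device)
    output = model.generate(**inputs, max_new_tokens=max_new_tokens)
    # Decode only the newly generated tokens, not the echoed prompt.
    return tokenizer.decode(
        output[0][inputs["input_ids"].shape[1]:], skip_special_tokens=True
    )
```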