Gille/StrangeMerges_17-7B-dare_ties

Text Generation · Concurrency Cost: 1 · Model Size: 7B · Quant: FP8 · Ctx Length: 4k · Published: Jan 31, 2024 · License: apache-2.0 · Architecture: Transformer · Open Weights

Gille/StrangeMerges_17-7B-dare_ties is a 7 billion parameter language model created by Gille, formed by merging Gille/StrangeMerges_16-7B-slerp and Gille/StrangeMerges_12-7B-slerp using the dare_ties method. This model achieves an average score of 69.54 on the Open LLM Leaderboard, demonstrating strong general reasoning and language understanding across various benchmarks. With a 4096-token context length, it is suitable for a range of general-purpose text generation and comprehension tasks.


Overview

Gille/StrangeMerges_17-7B-dare_ties is a 7 billion parameter language model developed by Gille. It is a product of merging two existing models, Gille/StrangeMerges_16-7B-slerp and Gille/StrangeMerges_12-7B-slerp, utilizing the dare_ties merge method. This approach combines the strengths of its constituent models to create a new, potentially more capable iteration.
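Merges like this are commonly produced with mergekit, where the method and parent models are declared in a YAML file. The sketch below is illustrative only: the `base_model`, `density`, and `weight` values are assumptions, not the actual settings used for this model.

```yaml
# Hypothetical mergekit configuration for a dare_ties merge of the two
# parent models named above. base_model, density, and weight are
# illustrative assumptions, not values confirmed by the model card.
models:
  - model: Gille/StrangeMerges_16-7B-slerp
    parameters:
      density: 0.5   # fraction of delta weights retained by DARE
      weight: 0.5
  - model: Gille/StrangeMerges_12-7B-slerp
    parameters:
      density: 0.5
      weight: 0.5
merge_method: dare_ties
base_model: mistralai/Mistral-7B-v0.1  # assumed shared base; not stated in the card
dtype: bfloat16
```

dare_ties requires a common base model because it operates on the delta between each fine-tune and that base, randomly dropping a fraction of deltas and resolving sign conflicts before merging.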

Key Capabilities

  • General Language Understanding: Achieves competitive scores on standard benchmarks, indicating proficiency in various language tasks.
  • Reasoning: Performs well on reasoning benchmarks such as the AI2 Reasoning Challenge (66.64) and GSM8k (64.37).
  • Context Handling: Supports a context length of 4096 tokens, allowing for processing moderately long inputs.
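The 4096-token limit means longer inputs must be truncated or windowed before generation. A minimal, model-free sketch of left-truncating a token sequence so that prompt plus generation budget fit the window (the function and parameter names are illustrative, not part of any specific API):

```python
def fit_to_context(token_ids, ctx_len=4096, max_new_tokens=256):
    """Keep the most recent tokens so prompt + generation fits in ctx_len.

    Left-truncation preserves the end of the input, which usually carries
    the instruction or question in chat-style prompts.
    """
    budget = ctx_len - max_new_tokens  # room left for the prompt itself
    if budget <= 0:
        raise ValueError("max_new_tokens must be smaller than ctx_len")
    return token_ids[-budget:] if len(token_ids) > budget else token_ids


# Example: a 5000-token input is trimmed to the last 3840 tokens.
ids = list(range(5000))
trimmed = fit_to_context(ids)
print(len(trimmed))  # 3840 (= 4096 - 256)
print(trimmed[-1])   # 4999: the end of the input survives truncation
```

The same idea applies regardless of tokenizer; in practice you would pass the truncated ids (or the re-decoded text) to the model's generation call.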

Performance Benchmarks

The model's performance is evaluated on the Open LLM Leaderboard, with the following key results:

  • Average Score: 69.54
  • AI2 Reasoning Challenge (25-Shot): 66.64
  • HellaSwag (10-Shot): 86.04
  • MMLU (5-Shot): 65.07
  • TruthfulQA (0-shot): 53.18
  • Winogrande (5-shot): 81.93
  • GSM8k (5-shot): 64.37
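The leaderboard average is simply the unweighted mean of the six benchmark scores, which can be checked directly (the dictionary keys are plain labels used for illustration):

```python
# Open LLM Leaderboard scores from the table above.
scores = {
    "ARC (25-shot)": 66.64,
    "HellaSwag (10-shot)": 86.04,
    "MMLU (5-shot)": 65.07,
    "TruthfulQA (0-shot)": 53.18,
    "Winogrande (5-shot)": 81.93,
    "GSM8k (5-shot)": 64.37,
}
# Unweighted mean across all six benchmarks.
average = sum(scores.values()) / len(scores)
print(round(average, 2))  # 69.54, matching the reported average
```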

Good For

This model is suitable for developers looking for a 7B parameter model with solid general-purpose capabilities, particularly for tasks requiring reasoning and broad language understanding, as indicated by its benchmark scores.