Gille/StrangeMerges_16-7B-slerp

Text generation · Model size: 7B · Quant: FP8 · Context length: 4K · Published: Jan 31, 2024 · License: apache-2.0 · Architecture: Transformer · Open weights

Gille/StrangeMerges_16-7B-slerp is a 7-billion-parameter language model created by Gille, formed by merging Gille/StrangeMerges_15-7B-slerp and SanjiWatsuki/Kunoichi-7B via spherical linear interpolation (slerp). The model has a 4096-token context length and achieves an average score of 72.80 on the Open LLM Leaderboard, demonstrating strong general reasoning and language understanding. It is suitable for a variety of general-purpose natural language processing tasks.


Overview

Gille/StrangeMerges_16-7B-slerp is a 7-billion-parameter language model developed by Gille. It is the product of a merge operation combining two base models: Gille/StrangeMerges_15-7B-slerp and SanjiWatsuki/Kunoichi-7B. The merge was performed using the slerp (spherical linear interpolation) method, with specific parameter weightings applied to the self-attention and MLP layers to balance the contribution of each parent model. The model supports a context length of 4096 tokens.
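To illustrate what slerp does during a merge, here is a minimal, hypothetical sketch of spherical linear interpolation applied to two weight vectors. This is not the actual merge tooling or configuration used for this model; it only shows the math: unlike plain linear averaging, slerp interpolates along the arc between the two weight directions, preserving magnitude-direction structure.

```python
import math

def slerp(t, v0, v1, eps=1e-8):
    """Spherical linear interpolation between two weight vectors.

    t=0 returns v0, t=1 returns v1; intermediate values of t follow the
    great-circle arc between the directions of v0 and v1.
    """
    n0 = math.sqrt(sum(x * x for x in v0))
    n1 = math.sqrt(sum(x * x for x in v1))
    # Cosine of the angle between the two vectors, clamped for safety.
    dot = sum(a * b for a, b in zip(v0, v1)) / max(n0 * n1, eps)
    dot = max(-1.0, min(1.0, dot))
    theta = math.acos(dot)
    if theta < eps:
        # Nearly parallel vectors: fall back to linear interpolation.
        return [(1 - t) * a + t * b for a, b in zip(v0, v1)]
    s0 = math.sin((1 - t) * theta) / math.sin(theta)
    s1 = math.sin(t * theta) / math.sin(theta)
    return [s0 * a + s1 * b for a, b in zip(v0, v1)]

# Orthogonal unit vectors: t=0.5 lands halfway along the arc,
# at roughly [0.707, 0.707] rather than the linear midpoint [0.5, 0.5].
print(slerp(0.5, [1.0, 0.0], [0.0, 1.0]))
```

In a real merge, an interpolation factor like `t` is applied per tensor (or per layer group, e.g. with different weightings for self-attention and MLP tensors, as described above).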

Key Capabilities & Performance

This model demonstrates robust performance across various benchmarks, as evaluated on the Open LLM Leaderboard. It achieved an average score of 72.80, indicating strong general language understanding and reasoning. Specific benchmark results include:

  • AI2 Reasoning Challenge (25-Shot): 69.03
  • HellaSwag (10-Shot): 87.15
  • MMLU (5-Shot): 65.65
  • TruthfulQA (0-shot): 62.97
  • Winogrande (5-shot): 81.29
  • GSM8k (5-shot): 70.74

Good For

  • General-purpose natural language processing tasks requiring solid reasoning and language comprehension.
  • Applications where a 7B-parameter model with a 4096-token context window offers the right balance of capability and computational cost.
  • Users looking for a model with a balanced performance profile across diverse benchmarks, including common sense reasoning, reading comprehension, and mathematical problem-solving.