Stopwolf/Cerberus-7B-slerp

Text Generation · Concurrency Cost: 1 · Model Size: 7B · Quant: FP8 · Ctx Length: 4k · Published: Jan 25, 2024 · License: apache-2.0 · Architecture: Transformer · Open Weights

Cerberus-7B-slerp is a 7 billion parameter language model created by Stopwolf, formed by spherically interpolating (slerp) fblgit/UNA-TheBeagle-7b-v1 and UCLA-AGI/zephyr-7b-sft-full-SPIN-iter3. The merged model demonstrates strong general reasoning, achieving an average score of 63.46 on the Open LLM Leaderboard. It is suitable for tasks requiring robust understanding and generation, performing especially well on the HellaSwag and Winogrande benchmarks.


Cerberus-7B-slerp: A Merged 7B Language Model

Cerberus-7B-slerp is a 7 billion parameter model developed by Stopwolf, created through a spherical linear interpolation (slerp) merge of two distinct base models: fblgit/UNA-TheBeagle-7b-v1 and UCLA-AGI/zephyr-7b-sft-full-SPIN-iter3. This merging technique aims to combine the strengths of its constituent models.
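At its core, slerp interpolates along the arc between two flattened weight tensors rather than along the straight line a plain weighted average would follow, which better preserves the geometry of the weights. A minimal, illustrative sketch of the formula in pure Python (real merge tools apply this tensor-by-tensor across the full state dict, with per-layer interpolation factors):

```python
import math

def slerp(t, v0, v1, eps=1e-8):
    """Spherical linear interpolation between two flat weight vectors.

    t=0 returns v0, t=1 returns v1. Falls back to linear interpolation
    when the vectors are nearly colinear, as common merge tools do.
    """
    n0 = math.sqrt(sum(x * x for x in v0))
    n1 = math.sqrt(sum(x * x for x in v1))
    # Cosine of the angle between the two (normalized) vectors
    dot = sum(a * b for a, b in zip(v0, v1)) / (n0 * n1)
    if abs(dot) > 1.0 - eps:
        # Nearly parallel: plain lerp is numerically safer
        return [(1 - t) * a + t * b for a, b in zip(v0, v1)]
    theta = math.acos(max(-1.0, min(1.0, dot)))
    s0 = math.sin((1 - t) * theta) / math.sin(theta)
    s1 = math.sin(t * theta) / math.sin(theta)
    return [s0 * a + s1 * b for a, b in zip(v0, v1)]

# Toy example: halfway between two orthogonal "weight" vectors
mid = slerp(0.5, [1.0, 0.0], [0.0, 1.0])
```

Note that unlike a simple average (which here would give [0.5, 0.5]), the slerp midpoint keeps unit length, landing at roughly [0.707, 0.707].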

Key Capabilities & Performance

The model's performance has been evaluated on the Open LLM Leaderboard, demonstrating solid general-purpose capabilities:

  • Average Score: 63.46
  • AI2 Reasoning Challenge (25-Shot): 69.54
  • HellaSwag (10-Shot): 87.33
  • MMLU (5-Shot): 63.25
  • TruthfulQA (0-shot): 61.35
  • Winogrande (5-shot): 81.29
  • GSM8k (5-shot): 17.97

These results indicate strong performance in common sense reasoning (HellaSwag, Winogrande) and general knowledge (MMLU, AI2 Reasoning Challenge).

When to Use This Model

Cerberus-7B-slerp is a good candidate for applications that need a capable 7B model with balanced performance across reasoning and language-understanding tasks. Its strong HellaSwag and Winogrande scores suggest suitability for tasks involving contextual understanding and disambiguation. Its low GSM8k score (17.97) indicates it is not well suited to multi-step mathematical problem-solving, but its overall performance makes it a versatile choice for general text generation and comprehension.
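Merges of this kind are typically produced with a tool such as mergekit. The exact configuration used for Cerberus-7B-slerp is not published here, but a hypothetical slerp config in mergekit's format (the layer range and per-filter interpolation schedule below are illustrative assumptions, not the author's values) might look like:

```yaml
# Hypothetical mergekit slerp config -- values are illustrative only
slices:
  - sources:
      - model: fblgit/UNA-TheBeagle-7b-v1
        layer_range: [0, 32]
      - model: UCLA-AGI/zephyr-7b-sft-full-SPIN-iter3
        layer_range: [0, 32]
merge_method: slerp
base_model: fblgit/UNA-TheBeagle-7b-v1
parameters:
  t:
    - filter: self_attn
      value: [0, 0.5, 0.3, 0.7, 1]   # assumed schedule
    - filter: mlp
      value: [1, 0.5, 0.7, 0.3, 0]   # assumed schedule
    - value: 0.5                      # default interpolation factor
dtype: bfloat16
```

The per-filter `t` schedules let the merge weight attention and MLP layers toward different parents at different depths; a single scalar `t` would interpolate every tensor uniformly.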