s3nh/Severusectum-7B-DPO

Text generation · Model size: 7B · Quant: FP8 · Context length: 4k · Published: Feb 3, 2024 · License: MIT · Architecture: Transformer · Open weights

s3nh/Severusectum-7B-DPO is a 7 billion parameter language model created by s3nh, formed by merging FelixChao/Sectumsempra-7B-DPO and FelixChao/WestSeverus-7B-DPO-v2 using the SLERP method. The model scores an average of 75.18 on the Open LLM Leaderboard, with solid results on reasoning, commonsense, and language-understanding benchmarks. With a 4096-token context length, it is suitable for general-purpose applications that need robust performance in its size class.


Severusectum-7B-DPO: A Merged Language Model

s3nh/Severusectum-7B-DPO is a 7 billion parameter language model developed by s3nh, leveraging the SLERP merge method to combine two distinct DPO-tuned models: FelixChao/Sectumsempra-7B-DPO and FelixChao/WestSeverus-7B-DPO-v2. This merging strategy aims to synthesize the strengths of its constituent models into a single, cohesive unit.
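To give a sense of what SLERP merging does, here is a minimal sketch of spherical linear interpolation applied to a pair of weight vectors. This is an illustration of the underlying formula only, not the actual merge tooling: real merges (e.g. with mergekit) apply this per parameter tensor, often with an interpolation factor that varies across layers.

```python
import numpy as np

def slerp(w1, w2, t, eps=1e-8):
    """Spherical linear interpolation between two weight vectors.

    Unlike plain linear interpolation, SLERP moves along the arc between
    the two weight directions, preserving geometric structure.
    """
    v1 = w1 / (np.linalg.norm(w1) + eps)
    v2 = w2 / (np.linalg.norm(w2) + eps)
    dot = np.clip(np.dot(v1, v2), -1.0, 1.0)
    theta = np.arccos(dot)          # angle between the two directions
    if theta < eps:                 # nearly parallel: fall back to lerp
        return (1 - t) * w1 + t * w2
    s = np.sin(theta)
    return (np.sin((1 - t) * theta) / s) * w1 + (np.sin(t * theta) / s) * w2

# Interpolating halfway between two orthogonal unit vectors lands on the
# arc's midpoint, which still has unit norm (lerp would shrink it).
a = np.array([1.0, 0.0])
b = np.array([0.0, 1.0])
mid = slerp(a, b, 0.5)
```

The norm-preserving behavior shown in the last lines is the usual motivation for SLERP over simple weight averaging when merging models.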

Key Capabilities & Performance

This model demonstrates solid performance across a range of benchmarks, as evaluated on the Open LLM Leaderboard. It achieved an average score of 75.18, with notable results in specific areas:

  • AI2 Reasoning Challenge (25-Shot): 71.50
  • HellaSwag (10-Shot): 88.55
  • MMLU (5-Shot): 64.79
  • TruthfulQA (0-shot): 72.45
  • Winogrande (5-shot): 83.27
  • GSM8k (5-shot): 70.51

These scores indicate proficiency in reasoning, common sense, factual recall, and mathematical problem-solving. The model operates with a context length of 4096 tokens.
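The 4096-token window is the main sizing constraint when building prompts for this model. A rough pre-flight check like the following can catch oversized prompts before a request is sent; the 4-characters-per-token ratio is a coarse English-text heuristic, not the model's real tokenizer.

```python
def fits_context(prompt: str, max_new_tokens: int,
                 ctx_len: int = 4096, chars_per_token: float = 4.0) -> bool:
    """Roughly check that prompt plus generation fits the context window.

    chars_per_token ~ 4.0 is a heuristic for English text; for an exact
    count, tokenize the prompt with the model's own tokenizer instead.
    """
    est_prompt_tokens = len(prompt) / chars_per_token
    return est_prompt_tokens + max_new_tokens <= ctx_len

print(fits_context("Hello" * 100, max_new_tokens=512))   # short prompt fits
print(fits_context("x" * 20000, max_new_tokens=512))     # ~5000 tokens: too long
```

For production use, replace the heuristic with an exact token count from the model's tokenizer.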

Good For

  • General-purpose text generation and understanding: Its balanced performance across various benchmarks makes it suitable for a wide array of tasks.
  • Applications requiring robust reasoning and common sense: supported by its ARC, HellaSwag, and Winogrande scores.
  • Developers exploring model merging: a working example of a SLERP merge combining two DPO-tuned 7B models into a single checkpoint.