Severusectum-7B-DPO: A Merged Language Model
s3nh/Severusectum-7B-DPO is a 7-billion-parameter language model developed by s3nh. It was built with the SLERP (spherical linear interpolation) merge method, combining two DPO-tuned models: FelixChao/Sectumsempra-7B-DPO and FelixChao/WestSeverus-7B-DPO-v2. The merge aims to fold the strengths of both constituent models into a single set of weights.
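SLERP interpolates along the arc between two parameter tensors rather than along a straight line, which preserves the magnitude of the interpolated weights better than plain averaging. A minimal sketch of the underlying math (not the author's actual merge pipeline, which typically uses a dedicated merge toolkit) might look like this:

```python
import numpy as np

def slerp(t, v0, v1, eps=1e-8):
    """Spherical linear interpolation between two flat weight vectors.

    t: interpolation factor in [0, 1]; 0 returns v0, 1 returns v1.
    """
    # Angle between the two vectors, computed on normalized copies
    v0_n = v0 / np.linalg.norm(v0)
    v1_n = v1 / np.linalg.norm(v1)
    dot = np.clip(np.dot(v0_n, v1_n), -1.0, 1.0)
    theta = np.arccos(dot)
    if theta < eps:
        # Nearly parallel vectors: fall back to linear interpolation
        return (1 - t) * v0 + t * v1
    sin_theta = np.sin(theta)
    return (np.sin((1 - t) * theta) / sin_theta) * v0 \
         + (np.sin(t * theta) / sin_theta) * v1
```

In a real merge this interpolation is applied tensor-by-tensor across the two checkpoints, often with a different `t` per layer group.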
Key Capabilities & Performance
This model demonstrates solid performance across a range of benchmarks, as evaluated on the Open LLM Leaderboard. It achieved an average score of 75.18, with notable results in specific areas:
- AI2 Reasoning Challenge (25-shot): 71.50
- HellaSwag (10-shot): 88.55
- MMLU (5-shot): 64.79
- TruthfulQA (0-shot): 72.45
- Winogrande (5-shot): 83.27
- GSM8k (5-shot): 70.51
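The reported leaderboard average is simply the unweighted mean of these six benchmark scores, which can be checked in a couple of lines:

```python
# Benchmark scores as reported on the Open LLM Leaderboard
scores = {
    "ARC (25-shot)": 71.50,
    "HellaSwag (10-shot)": 88.55,
    "MMLU (5-shot)": 64.79,
    "TruthfulQA (0-shot)": 72.45,
    "Winogrande (5-shot)": 83.27,
    "GSM8k (5-shot)": 70.51,
}

# Unweighted mean, rounded to two decimals
average = round(sum(scores.values()) / len(scores), 2)
print(average)  # 75.18
```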
These scores indicate proficiency in reasoning, common sense, factual recall, and mathematical problem-solving. The model operates with a context length of 4096 tokens.
Good For
- General-purpose text generation and understanding: Its balanced performance across various benchmarks makes it suitable for a wide array of tasks.
- Applications requiring robust reasoning and common sense: Demonstrated by its scores on ARC and HellaSwag.
- Developers interested in model merging: A working example of SLERP applied to two DPO-tuned 7B models, useful as a reference point for similar merges.