brgx53/3Blarenegv3-ECE-PRYMMAL-Martial
brgx53/3Blarenegv3-ECE-PRYMMAL-Martial is a 7.6-billion-parameter language model created by brgx53 using the SLERP merge method. This model combines fblgit/cybertron-v4-qw7B-MGS and Tsunami-th/Tsunami-0.5x-7B-Instruct, and features a 131,072-token context length. It is designed for general-purpose applications, demonstrating balanced performance across benchmarks including IFEval, BBH, and MMLU-PRO.
Overview
brgx53/3Blarenegv3-ECE-PRYMMAL-Martial is a 7.6-billion-parameter language model developed by brgx53. It was created using the SLERP merge method, combining two base models: fblgit/cybertron-v4-qw7B-MGS and Tsunami-th/Tsunami-0.5x-7B-Instruct. The merge configuration applies varying interpolation values across the self-attention and MLP layers, with a general value of 0.5, and uses the bfloat16 dtype.
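Merges like this are typically defined in a mergekit YAML file. The following is an illustrative sketch only: the merge method, the two parent models, the default interpolation value of 0.5, and the bfloat16 dtype come from the card above, but the layer_range, base_model choice, and per-layer t schedules are assumptions for illustration, not the actual recipe.

```yaml
# Hypothetical mergekit SLERP configuration (illustrative values)
slices:
  - sources:
      - model: fblgit/cybertron-v4-qw7B-MGS
        layer_range: [0, 28]          # assumed layer count
      - model: Tsunami-th/Tsunami-0.5x-7B-Instruct
        layer_range: [0, 28]
merge_method: slerp
base_model: fblgit/cybertron-v4-qw7B-MGS  # assumed base
parameters:
  t:
    - filter: self_attn
      value: [0, 0.5, 0.3, 0.7, 1]    # assumed per-layer schedule
    - filter: mlp
      value: [1, 0.5, 0.7, 0.3, 0]    # assumed per-layer schedule
    - value: 0.5                      # general value from the card
dtype: bfloat16
```

The filter entries let the interpolation factor differ between self-attention and MLP tensors, matching the card's description of varying interpolation across layer types.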
Key Capabilities
This merged model demonstrates a balanced performance profile across several benchmarks, as evaluated on the Open LLM Leaderboard:
- IFEval (0-shot): Achieves 56.77, indicating proficiency in instruction following.
- BBH (3-shot): Scores 37.25, reflecting its ability on Big-Bench Hard tasks.
- MATH Lvl 5 (4-shot): Reaches 30.74, showing some capability in mathematical reasoning.
- MMLU-PRO (5-shot): Scores 38.95, suggesting general knowledge and understanding across various subjects.
Good For
Given its balanced performance across diverse benchmarks, this model is suitable for:
- General-purpose applications requiring a blend of instruction following, reasoning, and knowledge recall.
- Experimentation with merged models for developers interested in the SLERP method and its outcomes.
- Use cases where a 7.6-billion-parameter model with a 131,072-token context length fits the computational and performance requirements.
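For readers curious about the SLERP method itself: spherical linear interpolation blends two parent weight tensors along the arc between them rather than along a straight line, which tends to preserve weight norms better than plain averaging. A minimal per-tensor sketch in NumPy (not the mergekit implementation):

```python
import numpy as np

def slerp(v0, v1, t, eps=1e-8):
    """Spherical linear interpolation between two flat weight vectors.

    t=0 returns v0, t=1 returns v1; intermediate t follows the arc
    between the two directions.
    """
    # Angle between the two vectors, computed on normalized copies
    v0n = v0 / np.linalg.norm(v0)
    v1n = v1 / np.linalg.norm(v1)
    dot = np.clip(np.dot(v0n, v1n), -1.0, 1.0)
    omega = np.arccos(dot)
    if np.sin(omega) < eps:
        # Nearly parallel vectors: fall back to linear interpolation
        return (1 - t) * v0 + t * v1
    s0 = np.sin((1 - t) * omega) / np.sin(omega)
    s1 = np.sin(t * omega) / np.sin(omega)
    return s0 * v0 + s1 * v1
```

In a merge like the one above, t would vary by layer and by tensor type (self-attention vs. MLP), with 0.5 as the general value.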