Model Overview
Joseph717171/Llama-3.1-SuperNova-8B-Lite_TIES_with_Base is an 8-billion-parameter language model derived from the Llama-3.1 family. It was created by Joseph717171 using the mergekit tool, specifically employing the TIES (TrIm, Elect Sign & Merge) merge method.
Key Differentiator: Enhanced Instruction Following
This model's primary innovation lies in its merge strategy. It combines arcee-ai/Llama-3.1-SuperNova-Lite with its base model, meta-llama/Llama-3.1-8B. The developer, Joseph717171, refined the TIES merge by incorporating the density parameter alongside weight, a technique inspired by successful merges such as Rombodawg's work for Replete-AI. This approach was crucial for restoring and improving the instruction-following capabilities that merging can otherwise diminish.
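For intuition, here is a minimal sketch of the TIES procedure applied to a single weight tensor. This is illustrative, not mergekit's actual implementation; the function name and default values are assumptions. The task vector (fine-tuned weights minus base weights) is trimmed to its highest-magnitude entries according to `density`, then scaled by `weight` and added back to the base:

```python
import torch

def ties_merge_single(base: torch.Tensor, tuned: torch.Tensor,
                      density: float = 1.0, weight: float = 1.0) -> torch.Tensor:
    """Illustrative TIES step for one tensor and one donor model.

    With several donor models, TIES also elects a majority sign per
    parameter before merging; with a single donor there is no sign
    conflict to resolve.
    """
    # Task vector: what fine-tuning changed relative to the base model.
    delta = tuned - base
    if density < 1.0:
        # Trim: zero out all but the top-`density` fraction of entries
        # by absolute magnitude.
        k = max(1, int(delta.numel() * density))
        threshold = delta.abs().flatten().topk(k).values.min()
        delta = torch.where(delta.abs() >= threshold, delta,
                            torch.zeros_like(delta))
    # Merge: scale the surviving task vector and add it to the base.
    return base + weight * delta
```

In this single-donor sketch, density 1 keeps the entire task vector and weight 1 applies it fully, so the result equals the donor's weights for that tensor; mergekit's TIES implementation offers additional options (such as normalization) that can alter the outcome.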
Merge Details
The TIES merge was performed with a weight of 1 and density of 1 for the instruct model relative to the base. Post-merge, the configuration files were replaced with those of the original instruct model to ensure consistent behavior.
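A mergekit configuration consistent with that description might look like the following. This is a reconstruction for illustration, not the published config; the dtype value in particular is an assumption:

```yaml
merge_method: ties
base_model: meta-llama/Llama-3.1-8B
models:
  - model: arcee-ai/Llama-3.1-SuperNova-Lite
    parameters:
      weight: 1
      density: 1
dtype: bfloat16  # assumed; the published config may differ
```

A file in this shape can be applied with mergekit's command-line entry point, e.g. `mergekit-yaml config.yaml ./merged-model`.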
Performance Metrics
Evaluations on the Open LLM Leaderboard show the following results:
- Average Score: 43.07
- IFEval (0-shot): 80.96
- BBH (3-shot): 51.10
- MATH Lvl 5 (4-shot): 15.56
- GPQA (0-shot): 30.96
- MuSR (0-shot): 41.01
- MMLU-PRO (5-shot): 38.80
These scores show solid all-around performance, with the standout IFEval result reflecting the merge's focus on preserving instruction-following ability.