Joseph717171/Llama-3.1-SuperNova-8B-Lite_TIES_with_Base

8B parameters · FP8 tensor type · 32768 context length · License: llama3.1

Model Overview

Joseph717171/Llama-3.1-SuperNova-8B-Lite_TIES_with_Base is an 8-billion-parameter language model derived from the Llama-3.1 family. It was created by Joseph717171 using the mergekit tool, specifically employing the TIES (TrIm, Elect Sign & Merge) merge method.

Key Differentiator: Enhanced Instruction Following

This model's primary innovation lies in its merge strategy: it combines arcee-ai/Llama-3.1-SuperNova-Lite with its base model, meta-llama/Llama-3.1-8B. Joseph717171 refined the TIES merge by incorporating the density parameter alongside weight, a technique inspired by successful merges such as RomboDawg's Replete-AI models. This adjustment proved crucial for restoring and improving the instruction-following capabilities that are sometimes diminished in merged models.

Merge Details

The TIES merge was performed with a weight of 1 and a density of 1 for the instruct model relative to the base. After merging, the configuration files were replaced with those of the original instruct model to ensure consistent behavior.
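
To make the merge recipe concrete, the sketch below shows what a mergekit TIES configuration with these settings could look like. It is an illustration, not the author's published config; the dtype line in particular is an assumption.

```yaml
# Hypothetical mergekit config: TIES merge of the instruct model onto
# the Llama-3.1-8B base, with weight=1 and density=1 as described above.
models:
  - model: arcee-ai/Llama-3.1-SuperNova-Lite
    parameters:
      weight: 1
      density: 1
merge_method: ties
base_model: meta-llama/Llama-3.1-8B
dtype: bfloat16  # assumption; not stated in the card
```

With mergekit installed, a config like this runs via `mergekit-yaml config.yaml ./output-model`.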

Performance Metrics

Evaluations on the Open LLM Leaderboard show the following results:

  • Average Score: 43.07
  • IFEval (0-shot): 80.96
  • BBH (3-shot): 51.10
  • MATH Lvl 5 (4-shot): 15.56
  • GPQA (0-shot): 30.96
  • MuSR (0-shot): 41.01
  • MMLU-PRO (5-shot): 38.80

These scores indicate solid performance across a range of benchmarks, with particularly strong results on instruction following (IFEval).
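
For completeness, the model loads like any other Llama-3.1 checkpoint from the Hugging Face Hub. The snippet below is a minimal usage sketch with the transformers library; the prompt and generation settings are illustrative only.

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "Joseph717171/Llama-3.1-SuperNova-8B-Lite_TIES_with_Base"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype="auto",  # keep the checkpoint's native precision
    device_map="auto",   # spread weights across available devices
)

# The merge reuses the instruct model's config files, so the standard
# Llama-3.1 chat template should apply.
messages = [{"role": "user", "content": "Explain TIES merging in two sentences."}]
inputs = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

outputs = model.generate(inputs, max_new_tokens=128)
print(tokenizer.decode(outputs[0][inputs.shape[-1]:], skip_special_tokens=True))
```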