jsfs11/RandomMergeNoNormWEIGHTED-7B-DARETIES
jsfs11/RandomMergeNoNormWEIGHTED-7B-DARETIES is a 7-billion-parameter language model created by jsfs11 by merging FelixChao/WestSeverus-7B-DPO-v2, CultriX/Wernicke-7B-v9, and mlabonne/NeuralBeagle14-7B with the DARE-TIES merging method. The merge applies a weighted density configuration to its constituent models, and the model has a base context length of 4096 tokens. It is designed to combine the strengths of its merged components, achieving an average score of 75.36 on the Open LLM Leaderboard.
Model Overview
jsfs11/RandomMergeNoNormWEIGHTED-7B-DARETIES is a 7-billion-parameter language model developed by jsfs11. It is the product of merging three distinct models: FelixChao/WestSeverus-7B-DPO-v2, CultriX/Wernicke-7B-v9, and mlabonne/NeuralBeagle14-7B. The merge was performed via mergekit using DARE-TIES, which pairs TIES merging (TrIm, Elect Sign & Merge) with DARE-style sparsification of task vectors, allowing a fine-grained combination of per-model weights and densities.
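For reference, here is a minimal usage sketch in Python, assuming the repository hosts standard transformers-format weights (the usual case for mergekit outputs); the prompt and generation settings are illustrative only:

```python
# Minimal sketch: load the merged model with transformers.
# Assumes standard weights in the repo and that `accelerate` is
# installed so device_map="auto" can place the model on available hardware.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "jsfs11/RandomMergeNoNormWEIGHTED-7B-DARETIES"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id, torch_dtype="auto", device_map="auto"
)

prompt = "Explain model merging in one paragraph:"
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=64)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```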
Key Characteristics
- Merged Architecture: Combines the strengths of three different 7B models.
- Weighted Density Configuration: Uses specific density and weight parameters for each merged model, including `int8_mask` and `sparsify` settings for the MLP and self-attention layers (see the sketch after this list).
- Performance: Achieves an average score of 75.36 on the Open LLM Leaderboard, with notable scores of 88.50 on HellaSwag and 73.38 on the AI2 Reasoning Challenge.
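To make the trim/elect/merge idea concrete, below is a minimal Python sketch of TIES-style merging for a single weight tensor. It is illustrative only: the actual merge was produced by mergekit, the density and weight values shown are placeholders rather than this model's configuration, and DARE's random sparsification, `int8_mask`, and the per-layer MLP/self-attention overrides are omitted.

```python
import torch

def ties_merge(base, deltas, weights, density=0.5):
    """TIES-style merge of task vectors (fine-tuned model minus base).

    base: a weight tensor from the base model
    deltas: list of task-vector tensors, one per merged model
    weights: per-model mixing weights
    density: fraction of highest-magnitude entries kept per task vector
    """
    trimmed = []
    for d in deltas:
        # Trim: keep only the top-k entries by magnitude, zero the rest.
        k = max(1, int(density * d.numel()))
        thresh = d.abs().flatten().kthvalue(d.numel() - k + 1).values
        trimmed.append(torch.where(d.abs() >= thresh, d, torch.zeros_like(d)))

    stacked = torch.stack([w * t for w, t in zip(weights, trimmed)])
    # Elect sign: per-parameter majority sign of the weighted sum.
    sign = stacked.sum(dim=0).sign()
    # Merge: average only the entries that agree with the elected sign.
    agree = (stacked.sign() == sign) & (stacked != 0)
    merged = (stacked * agree).sum(dim=0) / agree.sum(dim=0).clamp(min=1)
    return base + merged

# Toy usage with random tensors standing in for real weight matrices;
# the three deltas mimic the three constituent models.
base = torch.zeros(4, 4)
deltas = [torch.randn(4, 4) for _ in range(3)]
merged = ties_merge(base, deltas, weights=[0.4, 0.3, 0.3], density=0.6)
```

Trimming before sign election is what lets TIES reconcile conflicting updates from the three donor models instead of simply averaging them.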
Performance Highlights
- AI2 Reasoning Challenge (25-Shot): 73.38
- HellaSwag (10-Shot): 88.50
- MMLU (5-Shot): 64.94
- TruthfulQA (0-shot): 71.50
- Winogrande (5-shot): 83.58
- GSM8k (5-shot): 70.28
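As a sanity check, the reported leaderboard average is the arithmetic mean of these six scores: (73.38 + 88.50 + 64.94 + 71.50 + 83.58 + 70.28) / 6 ≈ 75.36.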
This model is suited to tasks requiring balanced performance across varied benchmarks, drawing on the complementary capabilities of its constituent models.