ColorShadow-7B-v2: A Merged Language Model
ColorShadow-7B-v2 is a 7-billion-parameter language model developed by nlpguy, built from a Gradient-SLERP merge of two models: diffnamehard/Mistral-CatMacaroni-slerp-7B and cookinai/Valkyrie-V1. The merge, performed with mergekit, aims to combine the strengths of its constituent models.
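A mergekit SLERP merge of this kind is driven by a YAML configuration. The sketch below shows the general shape of such a config; the layer ranges, base-model choice, and per-filter t schedules are illustrative assumptions, not the actual values used for ColorShadow-7B-v2:

```yaml
# Illustrative mergekit SLERP config (not the model's actual recipe)
slices:
  - sources:
      - model: diffnamehard/Mistral-CatMacaroni-slerp-7B
        layer_range: [0, 32]
      - model: cookinai/Valkyrie-V1
        layer_range: [0, 32]
merge_method: slerp
base_model: diffnamehard/Mistral-CatMacaroni-slerp-7B
parameters:
  t:
    - filter: self_attn        # attention weights: example gradient schedule
      value: [0, 0.5, 0.3, 0.7, 1]
    - filter: mlp              # MLP weights: example mirrored schedule
      value: [1, 0.5, 0.7, 0.3, 0]
    - value: 0.5               # default t for all remaining tensors
dtype: bfloat16
```

The list-valued t entries are what makes this a "gradient" SLERP: mergekit interpolates the schedule across layer depth, so each layer gets its own mixing weight.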
Key Capabilities & Performance
The model's performance has been evaluated on the Open LLM Leaderboard, achieving an overall average score of 66.88. Specific benchmark results include:
- AI2 Reasoning Challenge (25-Shot): 67.15
- HellaSwag (10-Shot): 84.69
- MMLU (5-Shot): 60.34
- TruthfulQA (0-shot): 62.93
- Winogrande (5-shot): 78.85
- GSM8k (5-shot): 47.31
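As a quick sanity check, the reported overall average is the plain mean of the six benchmark scores:

```python
# Recompute the Open LLM Leaderboard average from the per-benchmark scores above.
scores = {
    "ARC (25-shot)": 67.15,
    "HellaSwag (10-shot)": 84.69,
    "MMLU (5-shot)": 60.34,
    "TruthfulQA (0-shot)": 62.93,
    "Winogrande (5-shot)": 78.85,
    "GSM8k (5-shot)": 47.31,
}
average = sum(scores.values()) / len(scores)
print(f"{average:.2f}")  # 66.88
```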
These scores indicate balanced capability across reasoning, commonsense, and language-understanding tasks, with particularly strong performance on HellaSwag (84.69).
Merging Methodology
The model uses a Gradient-SLERP merge, in which the interpolation parameter t is not a single constant but a schedule that varies across layer depth, with separate schedules for the self_attn and mlp weights. This gives nuanced control over how much each source model contributes at each layer.
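SLERP (spherical linear interpolation) blends two weight tensors along the great-circle arc between them rather than along a straight line, which better preserves weight-vector norms. A minimal NumPy sketch of the idea, assuming a simple linear t ramp across layers (this is illustrative, not mergekit's actual implementation):

```python
import numpy as np

def slerp(t, v0, v1, eps=1e-8):
    """Spherical linear interpolation between two weight tensors.

    t=0 returns v0, t=1 returns v1; intermediate t values move along
    the arc between the flattened tensors instead of the chord.
    """
    v0f = v0.ravel().astype(np.float64)
    v1f = v1.ravel().astype(np.float64)
    # Angle between the two weight vectors
    cos_omega = np.dot(v0f, v1f) / (np.linalg.norm(v0f) * np.linalg.norm(v1f))
    cos_omega = np.clip(cos_omega, -1.0, 1.0)
    omega = np.arccos(cos_omega)
    if omega < eps:
        # Nearly parallel vectors: fall back to plain linear interpolation
        return (1 - t) * v0 + t * v1
    sin_omega = np.sin(omega)
    s0 = np.sin((1 - t) * omega) / sin_omega
    s1 = np.sin(t * omega) / sin_omega
    return (s0 * v0f + s1 * v1f).reshape(v0.shape)

# "Gradient" SLERP: t varies per layer, e.g. ramping from 0 to 1 so early
# layers favor one parent model and later layers favor the other.
num_layers = 32
layer_ts = np.linspace(0.0, 1.0, num_layers)
```

In a real gradient merge, separate schedules (like the self_attn and mlp filters mentioned above) would replace the single linear ramp shown here.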
Good For
- General-purpose language understanding and generation tasks.
- Applications requiring strong commonsense reasoning, as indicated by its HellaSwag score.
- Exploration of merged model architectures and their performance characteristics.