nlpguy/ColorShadow-7B-v2
ColorShadow-7B-v2: A Merged Language Model
ColorShadow-7B-v2 is a 7 billion parameter language model developed by nlpguy, built upon a Gradient-SLERP merge of two distinct models: diffnamehard/Mistral-CatMacaroni-slerp-7B and cookinai/Valkyrie-V1. This merging technique, performed using mergekit, aims to combine the strengths of its constituent models.
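The core of a SLERP merge is spherical linear interpolation between corresponding weight tensors of the two source models. A minimal numpy sketch of that operation (the function and its tensor handling are illustrative, not mergekit's actual implementation):

```python
import numpy as np

def slerp(t, v0, v1, eps=1e-8):
    """Spherical linear interpolation between two weight tensors.

    t=0 returns v0, t=1 returns v1; intermediate values follow the
    great-circle arc between the tensors rather than the straight
    line used by plain weight averaging.
    """
    v0_flat, v1_flat = v0.ravel(), v1.ravel()
    # Cosine of the angle between the two flattened weight vectors.
    dot = np.dot(v0_flat, v1_flat) / (
        np.linalg.norm(v0_flat) * np.linalg.norm(v1_flat) + eps
    )
    dot = np.clip(dot, -1.0, 1.0)
    theta = np.arccos(dot)
    if theta < eps:
        # Nearly colinear tensors: fall back to linear interpolation.
        return (1.0 - t) * v0 + t * v1
    sin_theta = np.sin(theta)
    return (np.sin((1.0 - t) * theta) / sin_theta) * v0 \
        + (np.sin(t * theta) / sin_theta) * v1
```

Interpolating along the arc preserves the norm geometry of the weights better than naive averaging, which is one reason SLERP is a popular merge method for same-architecture 7B models.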
Key Capabilities & Performance
The model's performance has been evaluated on the Open LLM Leaderboard, achieving an overall average score of 66.88. Specific benchmark results include:
- AI2 Reasoning Challenge (25-shot): 67.15
- HellaSwag (10-shot): 84.69
- MMLU (5-shot): 60.34
- TruthfulQA (0-shot): 62.93
- Winogrande (5-shot): 78.85
- GSM8k (5-shot): 47.31
These scores indicate balanced capabilities across reasoning, commonsense, and general language-understanding tasks, with particularly strong performance on HellaSwag.
Merging Methodology
The model utilizes a Gradient-SLERP merge, in which separate interpolation-factor (t) gradients are applied to the self_attn and mlp layer groups, so each base model's contribution can vary by layer type and depth. This approach allows for nuanced control over how features from the source models are integrated.
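For illustration, a mergekit configuration for this kind of merge typically looks like the sketch below. The t gradients and choice of base_model here are assumptions for the example; the card does not restate the exact values used for ColorShadow-7B-v2:

```yaml
slices:
  - sources:
      - model: diffnamehard/Mistral-CatMacaroni-slerp-7B
        layer_range: [0, 32]
      - model: cookinai/Valkyrie-V1
        layer_range: [0, 32]
merge_method: slerp
base_model: diffnamehard/Mistral-CatMacaroni-slerp-7B  # assumed for the example
parameters:
  t:
    - filter: self_attn
      value: [0, 0.5, 0.3, 0.7, 1]   # illustrative gradient across depth
    - filter: mlp
      value: [1, 0.5, 0.7, 0.3, 0]
    - value: 0.5                     # default for all other tensors
dtype: bfloat16
```

Each `value` list is interpolated across the layer stack, which is what makes the merge a "gradient" SLERP rather than a single fixed blend ratio.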
Good For
- General-purpose language understanding and generation tasks.
- Applications requiring strong common sense reasoning, as indicated by its HellaSwag score.
- Exploration of merged model architectures and their performance characteristics.