nlpguy/ColorShadow-7B

Text generation · Concurrency cost: 1 · Model size: 7B · Quantization: FP8 · Context length: 4k · Published: Dec 30, 2023 · License: apache-2.0 · Architecture: Transformer · Open weights

ColorShadow-7B is a 7 billion parameter language model developed by nlpguy, created through a Gradient-SLERP merge of diffnamehard/Mistral-CatMacaroni-slerp-7B and cookinai/Valkyrie-V1. This model leverages the Mistral architecture and is optimized for general reasoning and language understanding tasks, achieving an average score of 68.34 on the Open LLM Leaderboard. Its merging methodology aims to combine the strengths of its base models for balanced performance across various benchmarks.


Overview

ColorShadow-7B is a 7 billion parameter language model developed by nlpguy. It is the product of a Gradient-SLERP merge, combining diffnamehard/Mistral-CatMacaroni-slerp-7B and cookinai/Valkyrie-V1 using the mergekit tool. This merging technique applies spherical linear interpolation (SLERP) with different interpolation parameters (t) for the self-attention and MLP layers, aiming to produce a balanced model that blends the characteristics of its constituent models.
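The model card does not spell out the SLERP math, so as an illustration, spherical linear interpolation between two weight tensors can be sketched as below. This is a simplified, hypothetical sketch: mergekit's actual implementation additionally handles per-layer t schedules, dtype conversion, and other edge cases.

```python
import numpy as np

def slerp(t: float, w0: np.ndarray, w1: np.ndarray, eps: float = 1e-8) -> np.ndarray:
    """Spherically interpolate between two weight tensors.

    t=0 returns w0, t=1 returns w1; intermediate values follow the
    great-circle arc between the two (flattened, normalized) tensors.
    """
    v0, v1 = w0.ravel(), w1.ravel()
    n0, n1 = np.linalg.norm(v0), np.linalg.norm(v1)
    # Angle between the two weight vectors
    cos_theta = np.clip(np.dot(v0 / n0, v1 / n1), -1.0, 1.0)
    theta = np.arccos(cos_theta)
    if theta < eps:
        # Nearly parallel vectors: fall back to plain linear interpolation
        return (1 - t) * w0 + t * w1
    s = np.sin(theta)
    out = (np.sin((1 - t) * theta) / s) * v0 + (np.sin(t * theta) / s) * v1
    return out.reshape(w0.shape)
```

A "gradient" SLERP merge then varies t with depth, for example sweeping t across the self-attention and MLP sublayers so that shallow layers favor one parent model and deep layers favor the other.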

Key Capabilities

  • General Reasoning: Achieves 67.83% on the AI2 Reasoning Challenge (25-Shot).
  • Common Sense Inference: Scores 85.15% on HellaSwag (10-Shot) and 80.58% on Winogrande (5-Shot).
  • Knowledge & Understanding: Demonstrates 61.69% on MMLU (5-Shot).
  • Mathematical Reasoning: Performs at 55.19% on GSM8k (5-Shot).
  • Factuality: Scores 59.56% on TruthfulQA (0-Shot).
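The leaderboard average reported elsewhere in this card can be sanity-checked from the six benchmark scores above; the small discrepancy with the published 68.34 reflects rounding of the individual scores.

```python
# Open LLM Leaderboard scores for ColorShadow-7B, as listed above
scores = {
    "ARC (25-shot)": 67.83,
    "HellaSwag (10-shot)": 85.15,
    "MMLU (5-shot)": 61.69,
    "TruthfulQA (0-shot)": 59.56,
    "Winogrande (5-shot)": 80.58,
    "GSM8k (5-shot)": 55.19,
}

average = sum(scores.values()) / len(scores)
print(round(average, 2))  # 68.33, vs. the published (rounded) 68.34
```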

Performance Highlights

Evaluated on the Open LLM Leaderboard, ColorShadow-7B achieved an average score of 68.34. Detailed per-benchmark results are published on the leaderboard, showcasing its balanced performance across a range of academic benchmarks.

Good for

  • Applications requiring a general-purpose language model with solid reasoning and common-sense abilities.
  • Use cases where a merged model approach is preferred for combining specific strengths of different base models.