Kukedlc/NeuralKrishna-7B-slerp

Text Generation · Concurrency Cost: 1 · Model Size: 7B · Quant: FP8 · Ctx Length: 4k · Published: Feb 18, 2024 · License: apache-2.0 · Architecture: Transformer · Open Weights · Cold

NeuralKrishna-7B-slerp is a 7 billion parameter language model developed by Kukedlc, created by merging Neural4gsm8k and NeuralMaxime-7B-slerp with the slerp (spherical linear interpolation) method. The model targets general language tasks and performs strongly across benchmarks covering reasoning, common sense, and mathematical problem-solving. With a context length of 4096 tokens, it is suitable for applications requiring robust understanding and generation capabilities.


NeuralKrishna-7B-slerp Overview

NeuralKrishna-7B-slerp is a 7 billion parameter language model developed by Kukedlc. It is a product of merging two distinct models, Kukedlc/Neural4gsm8k and Kukedlc/NeuralMaxime-7B-slerp, utilizing the slerp (spherical linear interpolation) merge method. This approach combines the strengths of its constituent models to enhance overall performance.
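Slerp interpolates along the arc between two parent weight tensors rather than averaging them linearly, which preserves the magnitude of the interpolated direction. A minimal NumPy sketch of the per-tensor operation (merge tooling applies something like this tensor-by-tensor with configurable interpolation factors; the function below is illustrative, not the tool Kukedlc used):

```python
import numpy as np

def slerp(t, v0, v1, eps=1e-8):
    """Spherical linear interpolation between two flattened weight tensors.

    t=0 returns v0, t=1 returns v1; intermediate t follows the arc
    between the two directions instead of the straight chord.
    """
    # Normalize copies to measure the angle between the two tensors.
    v0n = v0 / (np.linalg.norm(v0) + eps)
    v1n = v1 / (np.linalg.norm(v1) + eps)
    dot = np.clip(np.dot(v0n, v1n), -1.0, 1.0)
    theta = np.arccos(dot)  # angle between the parent tensors

    # Nearly parallel tensors: fall back to plain linear interpolation.
    if theta < eps:
        return (1.0 - t) * v0 + t * v1

    s = np.sin(theta)
    return (np.sin((1.0 - t) * theta) / s) * v0 + (np.sin(t * theta) / s) * v1

# Midpoint merge of two orthogonal unit vectors stays on the unit circle.
merged = slerp(0.5, np.array([1.0, 0.0]), np.array([0.0, 1.0]))
```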

Key Capabilities

  • Reasoning: Achieves 73.46% on the AI2 Reasoning Challenge (25-shot).
  • Common Sense: Scores 88.96% on HellaSwag (10-shot) and 83.27% on Winogrande (5-shot).
  • General Knowledge: Demonstrates 64.62% on MMLU (5-shot).
  • Factuality: Attains 74.29% on TruthfulQA (0-shot).
  • Mathematical Problem Solving: Scores 70.13% on GSM8k (5-shot).

Good For

  • Applications requiring a balanced performance across various language understanding and generation tasks.
  • Scenarios that benefit from model merging, combining the complementary strengths of its parent models (Neural4gsm8k and NeuralMaxime-7B-slerp).
  • General-purpose text generation and conversational AI within its 4096-token context window.