Kukedlc/NeuralKrishna-7B-v3

Text generation · Concurrency cost: 1 · Model size: 7B · Quantization: FP8 · Context length: 4k · Published: Mar 7, 2024 · License: apache-2.0 · Architecture: Transformer · Open weights

NeuralKrishna-7B-v3 is a 7 billion parameter language model developed by Kukedlc, created by merging NeuralGlitch-Yam-Peleg-7B-DT, Fasciculus-Arcuatus-7B-slerp, and Neural4gsm8k with the DARE TIES merge method. The model is built on the mlabonne/Monarch-7B base and is intended for general text generation, with the merge aiming to combine the strengths of its constituent models. It supports a context length of 4096 tokens, which is suitable for applications with moderate input and output lengths.


NeuralKrishna-7B-v3 Overview

NeuralKrishna-7B-v3 is a 7 billion parameter language model developed by Kukedlc, constructed by merging three distinct models: NeuralGlitch-Yam-Peleg-7B-DT, Fasciculus-Arcuatus-7B-slerp, and Neural4gsm8k. This merge was performed using the DARE TIES method, with mlabonne/Monarch-7B serving as the base model.
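
At a high level, DARE TIES merges models by working on "task vectors" (the per-parameter differences between each fine-tuned model and the shared base): DARE randomly drops a fraction of each delta and rescales the survivors, while TIES elects a per-parameter sign and averages only the contributions that agree with it. The toy NumPy sketch below illustrates this idea on a single weight matrix; it is a simplified illustration, not mergekit's actual implementation, and the density and weight values are placeholders rather than the ones used for this model.

```python
import numpy as np

rng = np.random.default_rng(0)

def dare(delta, density):
    # DARE: randomly drop (1 - density) of the task vector's entries,
    # then rescale the survivors by 1/density so the expected update is preserved.
    mask = rng.random(delta.shape) < density
    return np.where(mask, delta / density, 0.0)

def dare_ties(base, deltas, weights, density):
    # TIES-style combination of the sparsified task vectors:
    # elect a per-parameter sign from the weighted sum, then average only
    # the contributions whose sign agrees with the elected one.
    sparse = np.stack([w * dare(d, density) for d, w in zip(deltas, weights)])
    elected_sign = np.sign(sparse.sum(axis=0))
    agree = np.sign(sparse) == elected_sign
    merged_delta = np.where(agree, sparse, 0.0).sum(axis=0)
    merged_delta /= np.maximum(agree.sum(axis=0), 1)
    return base + merged_delta

# Toy tensors standing in for one weight matrix of the base and three fine-tunes.
base = rng.normal(size=(4, 4))
finetunes = [base + 0.1 * rng.normal(size=(4, 4)) for _ in range(3)]
deltas = [ft - base for ft in finetunes]

merged = dare_ties(base, deltas, weights=[0.3, 0.3, 0.4], density=0.6)
print(merged)
```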

Key Characteristics

  • Merged Architecture: Combines specific strengths of its constituent models through a weighted merge process, aiming for a balanced performance profile.
  • Parameter Efficient: At 7 billion parameters, it offers a balance between performance and computational resource requirements.
  • Configuration Details: The merge configuration assigns a density and weight to each contributing model, enables int8_mask, and uses the bfloat16 data type for the merged weights (a hedged configuration sketch follows this list).
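
As a concrete illustration, the sketch below writes out a mergekit-style DARE TIES configuration matching the description above. The density and weight values are placeholders and are not the actual values used to build NeuralKrishna-7B-v3; the contributing repository paths under the Kukedlc namespace are likewise assumptions.

```python
# Hypothetical mergekit configuration reconstructing the merge described above.
# Density/weight values and repo paths are illustrative placeholders.
from pathlib import Path

config = """\
models:
  - model: mlabonne/Monarch-7B
    # base model: contributes no task vector of its own
  - model: Kukedlc/NeuralGlitch-Yam-Peleg-7B-DT
    parameters:
      density: 0.6   # fraction of delta weights kept by DARE
      weight: 0.3    # relative contribution to the merge
  - model: Kukedlc/Fasciculus-Arcuatus-7B-slerp
    parameters:
      density: 0.6
      weight: 0.3
  - model: Kukedlc/Neural4gsm8k
    parameters:
      density: 0.6
      weight: 0.4
merge_method: dare_ties
base_model: mlabonne/Monarch-7B
parameters:
  int8_mask: true
dtype: bfloat16
"""

Path("dare_ties_config.yaml").write_text(config)
# The merge itself would then be run with mergekit, e.g.:
#   mergekit-yaml dare_ties_config.yaml ./NeuralKrishna-7B-v3
```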

Good For

  • General Text Generation: Suitable for a wide range of natural language processing tasks, including question answering, content creation, and conversational AI (see the usage sketch after this list).
  • Exploration of Merged Models: Provides a practical example of a model created via advanced merging techniques like DARE TIES, useful for researchers and developers interested in model fusion.
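
Below is a minimal usage sketch with Hugging Face transformers. It assumes the merged weights are published under the Kukedlc/NeuralKrishna-7B-v3 repository and that a GPU with enough memory for a 7B model in bfloat16 is available; sampling settings are illustrative.

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "Kukedlc/NeuralKrishna-7B-v3"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,   # matches the merge's bfloat16 dtype
    device_map="auto",
)

prompt = "Explain the DARE TIES merge method in two sentences."
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)

outputs = model.generate(
    **inputs,
    max_new_tokens=256,   # keep prompt plus output within the 4096-token context
    do_sample=True,
    temperature=0.7,
)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```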