automerger/Experiment27Neuralsirkrishna-7B
Text Generation · Concurrency Cost: 1 · Model Size: 7B · Quant: FP8 · Ctx Length: 4k · Published: Mar 11, 2024 · License: apache-2.0 · Architecture: Transformer · Open Weights

automerger/Experiment27Neuralsirkrishna-7B is a 7-billion-parameter language model created by Maxime Labonne through an automated merge of yam-peleg/Experiment27-7B and Kukedlc/NeuralSirKrishna-7b. The merge uses the slerp method with layer-specific parameter weighting for the self-attention and MLP blocks, aiming to combine the strengths of both constituent models. It targets general text generation tasks and offers a 4096-token context length.


Model Overview

automerger/Experiment27Neuralsirkrishna-7B is a 7 billion parameter language model, an automated merge orchestrated by Maxime Labonne. It is constructed by combining two base models, yam-peleg/Experiment27-7B and Kukedlc/NeuralSirKrishna-7b, using a slerp (spherical linear interpolation) merge method.
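Slerp interpolates between corresponding weight tensors of the two models along a great-circle arc rather than a straight line. A minimal sketch of the interpolation itself (the flattening strategy and fallback threshold here are illustrative, not mergekit's actual implementation):

```python
import numpy as np

def slerp(t, v0, v1, eps=1e-8):
    """Spherical linear interpolation between two weight tensors.

    t=0 returns v0, t=1 returns v1; intermediate values follow the
    great-circle arc between the two weight directions.
    """
    v0_f = v0.ravel().astype(np.float64)
    v1_f = v1.ravel().astype(np.float64)
    # Cosine of the angle between the two flattened weight vectors.
    dot = np.dot(v0_f, v1_f) / (np.linalg.norm(v0_f) * np.linalg.norm(v1_f))
    omega = np.arccos(np.clip(dot, -1.0, 1.0))
    # Nearly parallel weights: fall back to plain linear interpolation.
    if np.sin(omega) < eps:
        return (1.0 - t) * v0 + t * v1
    s0 = np.sin((1.0 - t) * omega) / np.sin(omega)
    s1 = np.sin(t * omega) / np.sin(omega)
    return (s0 * v0_f + s1 * v1_f).reshape(v0.shape)
```

Applied per tensor pair, this is what distinguishes a slerp merge from a plain weighted average: it follows the angular path between the weight vectors instead of cutting across it.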

Key Characteristics

  • Automated Merge: Created by an automated pipeline rather than hand-tuned merging, reflecting a systematic approach to combining candidate models.
  • Slerp Method: Uses the slerp merge method, which interpolates between model weights along a great-circle arc, giving a smoother blend than plain linear averaging.
  • Layer-Specific Weighting: The merge configuration applies distinct weighting values (t parameter) to self-attention and MLP layers, suggesting an optimization strategy to balance the contributions of the merged models in different architectural components.
  • Bfloat16 Precision: The model is configured to use bfloat16 data type, which is common for efficient inference on modern hardware.
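
The characteristics above are typically expressed in a mergekit slerp configuration of roughly this shape; the exact `t` schedules below are illustrative placeholders, not the model's published values:

```yaml
slices:
  - sources:
      - model: yam-peleg/Experiment27-7B
        layer_range: [0, 32]
      - model: Kukedlc/NeuralSirKrishna-7b
        layer_range: [0, 32]
merge_method: slerp
base_model: yam-peleg/Experiment27-7B
parameters:
  t:
    # Per-component interpolation schedules (illustrative values),
    # interpolated across the layer stack.
    - filter: self_attn
      value: [0, 0.5, 0.3, 0.7, 1]
    - filter: mlp
      value: [1, 0.5, 0.7, 0.3, 0]
    # Default weighting for all remaining tensors.
    - value: 0.5
dtype: bfloat16
```

The `filter` entries are what give self-attention and MLP layers their distinct weighting, and `dtype: bfloat16` sets the precision noted above.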

Good For

  • General Text Generation: Suitable for a wide range of text generation tasks, benefiting from the combined capabilities of its base models.
  • Experimentation with Merged Models: Provides a practical example of how automated merging techniques can be applied to create new models from existing ones.