Gille/StrangeMerges_32-7B-slerp

Text Generation | Model Size: 7B | Quant: FP8 | Ctx Length: 4K | Concurrency Cost: 1 | Published: Mar 6, 2024 | License: apache-2.0 | Architecture: Transformer | Open Weights

Gille/StrangeMerges_32-7B-slerp is a 7 billion parameter language model created by Gille, resulting from a spherical linear interpolation (slerp) merge of Gille/StrangeMerges_31-7B-slerp and yam-peleg/Experiment28-7B. The merge blends the strengths of its constituent models through layer-wise parameter weighting, aiming for balanced performance across a range of tasks. It is designed for general-purpose text generation and understanding, and suits applications that require a compact yet capable model.

Overview

Gille/StrangeMerges_32-7B-slerp is a 7 billion parameter language model developed by Gille. It is a product of a spherical linear interpolation (slerp) merge, combining two distinct models: Gille/StrangeMerges_31-7B-slerp and yam-peleg/Experiment28-7B. This merging technique, facilitated by LazyMergekit, allows for a nuanced blend of the source models' characteristics.
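
To ground the merge method, here is a minimal sketch of spherical linear interpolation applied to a pair of weight tensors, written in NumPy. This is the general slerp formula, not LazyMergekit's exact implementation; the function name `slerp` and the toy `layer_a`/`layer_b` tensors are illustrative only.

```python
import numpy as np

def slerp(t: float, w0: np.ndarray, w1: np.ndarray, eps: float = 1e-8) -> np.ndarray:
    """Spherical linear interpolation between two weight tensors.

    t = 0 returns w0, t = 1 returns w1; intermediate values move along
    the great circle connecting the two tensors on the hypersphere.
    """
    v0 = w0.ravel().astype(np.float64)
    v1 = w1.ravel().astype(np.float64)
    # Normalize to unit vectors to measure the angle between the tensors.
    u0 = v0 / (np.linalg.norm(v0) + eps)
    u1 = v1 / (np.linalg.norm(v1) + eps)
    dot = np.clip(np.dot(u0, u1), -1.0, 1.0)
    omega = np.arccos(dot)  # angle between the two weight vectors
    sin_omega = np.sin(omega)
    if sin_omega < eps:
        # Nearly colinear tensors: slerp degenerates to plain linear interpolation.
        merged = (1.0 - t) * v0 + t * v1
    else:
        s0 = np.sin((1.0 - t) * omega) / sin_omega
        s1 = np.sin(t * omega) / sin_omega
        merged = s0 * v0 + s1 * v1
    return merged.reshape(w0.shape).astype(w0.dtype)

# Toy usage: blend two random "layers" with an even t = 0.5 mix.
rng = np.random.default_rng(0)
layer_a = rng.normal(size=(64, 64)).astype(np.float32)
layer_b = rng.normal(size=(64, 64)).astype(np.float32)
merged_layer = slerp(0.5, layer_a, layer_b)
```

Unlike plain linear interpolation, slerp preserves the angular geometry between the two parameter sets, which is why it is a popular choice for model merging.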

Key Characteristics

  • Merge Method: Utilizes spherical linear interpolation (slerp) to combine the weights of the base models.
  • Layer-wise Parameter Blending: Specific t values are applied to different filter types (self_attn, mlp) across layers, indicating a fine-tuned approach to weight distribution during the merge (see the configuration sketch after this list).
  • Base Model: The merge process is anchored around yam-peleg/Experiment28-7B as the base model.
  • Precision: Configured to use bfloat16 for its operations.
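
To make the layer-wise blending concrete, the sketch below mirrors the shape of a mergekit-style slerp configuration, written as a Python dict. The model names, merge method, base model, filter types, and dtype come from this card; the `layer_range` bounds and the exact t schedules are illustrative placeholders, not the published values for this merge.

```python
# Shape of a mergekit-style slerp config as a Python dict.
# Model names, method, base model, filters, and dtype follow this card;
# layer_range and the t schedules below are illustrative, NOT the
# published values for StrangeMerges_32-7B-slerp.
merge_config = {
    "slices": [
        {
            "sources": [
                {"model": "Gille/StrangeMerges_31-7B-slerp", "layer_range": [0, 32]},
                {"model": "yam-peleg/Experiment28-7B", "layer_range": [0, 32]},
            ]
        }
    ],
    "merge_method": "slerp",
    "base_model": "yam-peleg/Experiment28-7B",
    "parameters": {
        "t": [
            # Anchor points interpolated across the layer stack, per filter.
            {"filter": "self_attn", "value": [0.0, 0.5, 0.3, 0.7, 1.0]},
            {"filter": "mlp", "value": [1.0, 0.5, 0.7, 0.3, 0.0]},
            {"value": 0.5},  # default t for all remaining tensors
        ]
    },
    "dtype": "bfloat16",
}
```

Giving self_attn and mlp tensors different t schedules lets the merge draw attention behavior more from one parent and feed-forward behavior more from the other, rather than applying a single uniform mix.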

Use Cases

This model is suitable for general text generation tasks where a 7B parameter model offers a good balance between quality and computational cost. Because it merges two parent models, it may inherit a diverse mix of capabilities, making it adaptable to a range of natural language processing applications.
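
A minimal inference sketch, assuming the standard Hugging Face transformers API (with torch and accelerate installed); the prompt is arbitrary:

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "Gille/StrangeMerges_32-7B-slerp"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,  # matches the merge's configured precision
    device_map="auto",           # requires the accelerate package
)

prompt = "Explain spherical linear interpolation in one sentence."
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=128)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```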