automerger/Experiment29Pastiche-7B

Text generation · Concurrency cost: 1 · Model size: 7B · Quant: FP8 · Context length: 4k · Published: Mar 10, 2024 · License: apache-2.0 · Architecture: Transformer · Open weights

automerger/Experiment29Pastiche-7B is a 7 billion parameter language model created by Maxime Labonne through an automated merge of yam-peleg/Experiment29-7B and CorticalStack/pastiche-crown-clown-7b-dare. The merge uses the slerp method across layers 0 to 32 of both models, and the result retains a 4096 token context length. Its construction via automated merging aims to combine the strengths of its constituent models, making it suitable for general text generation tasks.


Experiment29Pastiche-7B: An Automated Merge Model

Experiment29Pastiche-7B is a 7 billion parameter language model developed by Maxime Labonne. This model is notable for its creation via an automated merge process, combining two distinct base models: yam-peleg/Experiment29-7B and CorticalStack/pastiche-crown-clown-7b-dare.

Key Capabilities & Technical Details

  • Automated Merging: The model is produced by a slerp (spherical linear interpolation) merge applied across layers 0 to 32 of both constituent models, allowing a nuanced, per-tensor blend of features from the two sources (see the slerp sketch after this list).
  • Configurable Merge Parameters: The merge configuration sets separate interpolation schedules for the self_attn and mlp layer groups, so attention and feed-forward weights are blended with different ratios; a representative configuration also follows this list.
  • Base Model: The merge process used yam-peleg/Experiment29-7B as the base model, with CorticalStack/pastiche-crown-clown-7b-dare contributing to the merged architecture.
  • Parameter Count: It features 7 billion parameters, offering a balance between performance and computational efficiency.
  • Context Length: The model supports a context length of 4096 tokens, suitable for handling moderately long inputs and generating coherent responses.
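To make the merge method concrete, here is a minimal sketch of slerp as it applies tensor-by-tensor to model weights. This is an illustration of the technique only, not Maxime Labonne's exact implementation; automerger models are typically built with mergekit, whose slerp handles additional edge cases.

```python
import torch

def slerp(t: float, v0: torch.Tensor, v1: torch.Tensor, eps: float = 1e-8) -> torch.Tensor:
    """Spherical linear interpolation between two weight tensors.

    t=0 returns v0, t=1 returns v1; intermediate values follow the arc
    on the hypersphere rather than the straight line between the tensors.
    """
    # Flatten and normalize to measure the angle between the two tensors.
    u0 = v0.flatten() / (v0.norm() + eps)
    u1 = v1.flatten() / (v1.norm() + eps)
    dot = torch.clamp(torch.dot(u0, u1), -1.0, 1.0)

    # Nearly colinear tensors: fall back to plain linear interpolation.
    if 1.0 - dot.abs() < eps:
        return (1.0 - t) * v0 + t * v1

    omega = torch.acos(dot)           # angle between the two tensors
    sin_omega = torch.sin(omega)
    s0 = torch.sin((1.0 - t) * omega) / sin_omega
    s1 = torch.sin(t * omega) / sin_omega
    return s0 * v0 + s1 * v1
```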
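The merge itself is described by a declarative configuration. The dict below mirrors the YAML layout that mergekit-based slerp merges use; the interpolation schedules (the t values) follow the template commonly seen in the automerger pipeline and are an assumption here, not confirmed values for this exact model.

```python
# Illustrative mergekit-style slerp configuration, expressed as a Python dict.
# The t schedules follow the common automerger template; the exact values
# for Experiment29Pastiche-7B are an assumption, not confirmed.
merge_config = {
    "slices": [
        {
            "sources": [
                {"model": "yam-peleg/Experiment29-7B", "layer_range": [0, 32]},
                {"model": "CorticalStack/pastiche-crown-clown-7b-dare", "layer_range": [0, 32]},
            ]
        }
    ],
    "merge_method": "slerp",
    "base_model": "yam-peleg/Experiment29-7B",
    "parameters": {
        "t": [
            # Attention weights lean toward the second model in later layers...
            {"filter": "self_attn", "value": [0, 0.5, 0.3, 0.7, 1]},
            # ...while MLP weights follow the mirrored schedule.
            {"filter": "mlp", "value": [1, 0.5, 0.7, 0.3, 0]},
            # Default interpolation ratio for all remaining tensors.
            {"value": 0.5},
        ]
    },
    "dtype": "bfloat16",  # merge-time dtype in the common template (illustrative)
}
```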

Good For

  • General Text Generation: Given its merged nature, it is designed for a wide array of text generation tasks; a minimal loading example follows this list.
  • Experimentation with Merged Architectures: Developers interested in exploring the outcomes of automated model merging techniques will find this model particularly relevant.
  • Applications requiring a 7B parameter model: Its size makes it a viable option for deployment in environments where larger models might be too resource-intensive.
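For quick experiments, the model loads like any other 7B causal LM. Below is a minimal sketch using Hugging Face transformers, assuming the checkpoint is published on the Hub under the id shown above:

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "automerger/Experiment29Pastiche-7B"  # id as listed above

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype="auto",   # use the checkpoint's native dtype
    device_map="auto",    # place layers on available GPU(s)/CPU (needs accelerate)
)

prompt = "Explain spherical linear interpolation in one paragraph."
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=200)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```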