arcee-ai/Saul-Base-Clown-7B-Instruct-slerp

Text Generation · Concurrency Cost: 1 · Model Size: 7B · Quant: FP8 · Ctx Length: 4k · Published: Mar 30, 2024 · License: apache-2.0 · Architecture: Transformer · Open Weights

arcee-ai/Saul-Base-Clown-7B-Instruct-slerp is a 7-billion-parameter instruction-tuned language model created by arcee-ai. It is a merge of Equall/Saul-Base and CorticalStack/pastiche-crown-clown-7b-dare-dpo, built with the slerp merge method, and is designed to combine the strengths of its constituent models into a versatile base for natural language processing tasks. With a context length of 4096 tokens, it is well suited to general-purpose conversational AI and instruction following.


Model Overview

As summarized above, arcee-ai/Saul-Base-Clown-7B-Instruct-slerp is the product of merging two 7B base models: Equall/Saul-Base and CorticalStack/pastiche-crown-clown-7b-dare-dpo. The merge was performed with mergekit using the slerp (spherical linear interpolation) method, which interpolates along the arc between the parent models' weight vectors rather than averaging them linearly, aiming for a balanced combination of their characteristics.
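
To make the merge method concrete, here is a minimal sketch of slerp applied to a pair of weight tensors in PyTorch. This illustrates the general technique only, not arcee-ai's actual merge code; the function name and the `eps` fallback are our own.

```python
import torch

def slerp(t: float, a: torch.Tensor, b: torch.Tensor, eps: float = 1e-8) -> torch.Tensor:
    """Spherical linear interpolation between two weight tensors.

    Treats the flattened tensors as vectors and interpolates along the
    arc between them, falling back to plain linear interpolation when
    the vectors are (nearly) parallel.
    """
    a_flat, b_flat = a.flatten().float(), b.flatten().float()
    a_unit = a_flat / (a_flat.norm() + eps)
    b_unit = b_flat / (b_flat.norm() + eps)
    dot = torch.clamp(a_unit @ b_unit, -1.0, 1.0)
    omega = torch.acos(dot)            # angle between the two weight vectors
    if omega.abs() < eps:              # nearly parallel: slerp degenerates to lerp
        return (1 - t) * a + t * b
    sin_omega = torch.sin(omega)
    coeff_a = torch.sin((1 - t) * omega) / sin_omega
    coeff_b = torch.sin(t * omega) / sin_omega
    return (coeff_a * a_flat + coeff_b * b_flat).reshape(a.shape).to(a.dtype)

# Usage: blend two same-shaped weight tensors halfway along the arc.
# merged = slerp(0.5, weights_from_model_a, weights_from_model_b)
```

Compared with simple weight averaging, slerp preserves the geometric relationship between the two checkpoints, which is why it is often preferred for model merging.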

Key Characteristics

  • Merged Architecture: Leverages the strengths of two different 7B parameter models, Equall/Saul-Base and CorticalStack/pastiche-crown-clown-7b-dare-dpo.
  • Slerp Merge Method: Utilizes Spherical Linear Interpolation for merging, which can lead to a more harmonious combination of model weights compared to simpler averaging methods.
  • Instruction-Tuned: Designed to follow instructions effectively, making it suitable for a wide range of interactive and task-oriented applications.
  • Parameter Configuration: The merge configuration specifies different interpolation values (t) for the self-attention and MLP layers, indicating a fine-tuned approach to weight blending; see the configuration sketch after this list.
  • Bfloat16 Precision: The merged weights were produced in bfloat16, balancing numerical fidelity against memory efficiency (the hosted endpoint listed above additionally serves an FP8-quantized variant).
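
For illustration, the snippet below sketches what a mergekit-style slerp configuration with per-layer-type t values could look like, written as a Python dict (mergekit itself consumes YAML). The layer ranges and t schedules here are assumptions for demonstration, not the model's published recipe.

```python
# Hypothetical mergekit-style slerp configuration, shown as a Python dict.
# The layer ranges and t schedules are illustrative assumptions only.
merge_config = {
    "slices": [
        {
            "sources": [
                {"model": "Equall/Saul-Base", "layer_range": [0, 32]},
                {
                    "model": "CorticalStack/pastiche-crown-clown-7b-dare-dpo",
                    "layer_range": [0, 32],
                },
            ]
        }
    ],
    "merge_method": "slerp",
    "base_model": "Equall/Saul-Base",
    "parameters": {
        "t": [
            # Different interpolation schedules per layer type, as the
            # model card indicates for self-attention vs. MLP weights.
            {"filter": "self_attn", "value": [0.0, 0.5, 0.3, 0.7, 1.0]},
            {"filter": "mlp", "value": [1.0, 0.5, 0.7, 0.3, 0.0]},
            {"value": 0.5},  # default t for all remaining tensors
        ]
    },
    "dtype": "bfloat16",
}
```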

Use Cases

This model is well suited to developers who want a 7B instruction-tuned model that benefits from the combined capabilities of its merged components (a loading and generation sketch follows the list below). It can be applied to:

  • General-purpose conversational AI.
  • Instruction following tasks.
  • Text generation and summarization.
  • As a base for further fine-tuning on specific downstream applications.
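
A minimal loading-and-generation sketch, assuming the standard Hugging Face transformers API; the prompt is illustrative, and you should check the model card for its expected chat template:

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "arcee-ai/Saul-Base-Clown-7B-Instruct-slerp"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,  # the merge was performed in bfloat16
    device_map="auto",
)

prompt = "Summarize the key obligations in a standard NDA."
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=256, do_sample=False)

# Decode only the newly generated tokens, skipping the echoed prompt.
print(tokenizer.decode(outputs[0][inputs["input_ids"].shape[1]:], skip_special_tokens=True))
```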