arcee-ai/saul-mistral-v0.2-7b-slerp

Text Generation · Concurrency Cost: 1 · Model Size: 7B · Quant: FP8 · Ctx Length: 4k · Published: Mar 25, 2024 · License: apache-2.0 · Architecture: Transformer · Open Weights · Cold

arcee-ai/saul-mistral-v0.2-7b-slerp is a 7-billion-parameter language model created by arcee-ai by merging Equall/Saul-Base and mistralai/Mistral-7B-Instruct-v0.2 with the slerp merge method. The merge combines the two base models' self-attention and MLP layers with varying interpolation values, aiming to draw on the strengths of both. The result is a balanced performance profile suitable for general-purpose instruction-following tasks.


Model Overview

arcee-ai/saul-mistral-v0.2-7b-slerp is a 7-billion-parameter language model developed by arcee-ai. It is the product of merging two distinct base models, Equall/Saul-Base and mistralai/Mistral-7B-Instruct-v0.2, using the slerp (spherical linear interpolation) method via mergekit, a tool for combining neural network weights.
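At a high level, slerp interpolates along the great-circle arc between two weight tensors rather than along the straight line a plain average would take, which tends to preserve weight magnitudes better. A minimal NumPy sketch of the idea (illustrative only; mergekit's actual implementation handles tensors, filters, and edge cases differently):

```python
import numpy as np

def slerp(t, v0, v1, eps=1e-8):
    """Spherical linear interpolation between two weight tensors.

    t=0 returns v0, t=1 returns v1; intermediate t values follow the
    great-circle arc between the (normalized) flattened tensors.
    """
    v0f = v0.ravel().astype(np.float64)
    v1f = v1.ravel().astype(np.float64)
    # Angle between the two tensors, via their unit vectors.
    u0 = v0f / (np.linalg.norm(v0f) + eps)
    u1 = v1f / (np.linalg.norm(v1f) + eps)
    theta = np.arccos(np.clip(np.dot(u0, u1), -1.0, 1.0))
    if theta < eps:
        # Nearly parallel tensors: fall back to ordinary linear interpolation.
        return (1 - t) * v0 + t * v1
    s = np.sin(theta)
    out = (np.sin((1 - t) * theta) / s) * v0f + (np.sin(t * theta) / s) * v1f
    return out.reshape(v0.shape).astype(v0.dtype)
```

For orthogonal unit vectors, `slerp(0.5, a, b)` lands on the arc midpoint with norm 1, whereas a plain average would shrink the norm to about 0.71.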

Key Characteristics

  • Merged Architecture: Combines the Equall/Saul-Base and mistralai/Mistral-7B-Instruct-v0.2 models.
  • Slerp Merge Method: Utilizes spherical linear interpolation to blend the weights of the constituent models, aiming for a synergistic combination of their capabilities.
  • Parameter Blending: Distinct interpolation factors (t) were applied per layer type during the merge: self-attention layers use values ranging from 0 to 1 across the layer stack, MLP layers use values ranging from 1 to 0, and all other parameters use a constant 0.5.
  • Base Model: The mistralai/Mistral-7B-Instruct-v0.2 served as the foundational base model for the merge.
  • Precision: The model was processed using bfloat16 data type.
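The characteristics above map onto a standard mergekit slerp configuration. A sketch of what such a config plausibly looks like (the layer range and per-layer t schedules shown are illustrative of the described 0→1 / 1→0 ramps, not a verbatim copy of arcee-ai's config):

```yaml
slices:
  - sources:
      - model: Equall/Saul-Base
        layer_range: [0, 32]
      - model: mistralai/Mistral-7B-Instruct-v0.2
        layer_range: [0, 32]
merge_method: slerp
base_model: mistralai/Mistral-7B-Instruct-v0.2
parameters:
  t:
    - filter: self_attn       # self-attention layers: 0 -> 1 across depth
      value: [0, 0.5, 0.3, 0.7, 1]
    - filter: mlp             # MLP layers: 1 -> 0 across depth
      value: [1, 0.5, 0.7, 0.3, 0]
    - value: 0.5              # all other parameters
dtype: bfloat16
```

With a schedule like this, early layers lean toward one parent's attention and the other's MLPs, with the balance reversing in later layers.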

Potential Use Cases

This merged model is likely suitable for a range of applications where the combined strengths of its base models are beneficial. Given its origins, it can be expected to perform well in:

  • General instruction-following and conversational AI.
  • Text generation and summarization tasks.
  • Applications requiring a balance of reasoning and creative capabilities, inherited from its parent models.
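Since the merge uses mistralai/Mistral-7B-Instruct-v0.2 as its base, the model most likely expects Mistral's `[INST]` chat format. A minimal prompt-assembly sketch under that assumption (verify against the model's tokenizer chat template before relying on it):

```python
def build_prompt(messages):
    """Assemble a Mistral-style [INST] prompt from (role, content) turns.

    Assumes the merged model inherits Mistral-7B-Instruct-v0.2's template:
    user turns are wrapped in [INST] ... [/INST], assistant turns end with </s>.
    """
    prompt = "<s>"
    for role, content in messages:
        if role == "user":
            prompt += f"[INST] {content} [/INST]"
        else:  # assistant turn
            prompt += f" {content}</s>"
    return prompt

print(build_prompt([("user", "Summarize this contract clause.")]))
# -> <s>[INST] Summarize this contract clause. [/INST]
```

In practice, `tokenizer.apply_chat_template` from the transformers library is the safer route, since it reads the template shipped with the model rather than hard-coding it.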