arcee-ai/gemma-7b-slerp

Text Generation

  • Concurrency Cost: 1
  • Model Size: 8.5B
  • Quantization: FP8
  • Context Length: 8k
  • Published: Feb 27, 2024
  • License: apache-2.0
  • Architecture: Transformer
  • Weights: Open

arcee-ai/gemma-7b-slerp is an 8.5 billion parameter language model from arcee-ai, created by merging Google's Gemma 7B base and Gemma 7B-Instruct models with the Slerp (spherical linear interpolation) method. By combining the base and instruction-tuned Gemma variants, the merge aims for a balanced performance profile. The model is suitable for general-purpose language tasks, including instruction following and text generation, and supports a context length of 8192 tokens.


Model Overview

arcee-ai/gemma-7b-slerp is an 8.5 billion parameter language model derived from Google's Gemma architecture. It is a merged model, combining the google/gemma-7b base model and the google/gemma-7b-it instruction-tuned variant. This merge was performed using the Slerp (Spherical Linear Interpolation) method via mergekit, aiming to integrate the capabilities of both source models.
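The exact merge configuration is not reproduced in this overview, but mergekit Slerp merges are typically driven by a YAML configuration along the following lines. Below is a minimal sketch, written as a Python script that emits such a config; the layer ranges and interpolation factor are illustrative assumptions, not the values used for this model:

```python
# Illustrative mergekit Slerp configuration (hypothetical values; the
# actual layer ranges and interpolation schedule used for
# gemma-7b-slerp are not documented in this overview).
config = """\
merge_method: slerp
base_model: google/gemma-7b
slices:
  - sources:
      - model: google/gemma-7b      # base model
        layer_range: [0, 28]        # Gemma 7B has 28 transformer layers
      - model: google/gemma-7b-it   # instruction-tuned variant
        layer_range: [0, 28]
parameters:
  t: 0.5   # interpolation factor: 0.0 = pure base, 1.0 = pure instruct
dtype: bfloat16
"""

with open("config.yaml", "w") as f:
    f.write(config)

# The merge itself would then be produced with mergekit's CLI:
#   mergekit-yaml config.yaml ./gemma-7b-slerp
```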

Key Characteristics

  • Architecture: Based on Google's Gemma family.
  • Parameter Count: 8.5 billion parameters.
  • Merging Method: Utilizes the Slerp method, which interpolates between model weights along the arc between them rather than averaging them linearly, often resulting in a blend of their respective strengths (see the sketch after this list).
  • Context Length: Supports a context window of 8192 tokens.
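For intuition, the following is a minimal Python/PyTorch sketch of spherical linear interpolation applied to a pair of weight tensors. It illustrates the math only; mergekit's production implementation additionally handles per-layer interpolation schedules and numerical edge cases:

```python
import torch

def slerp(t: float, w0: torch.Tensor, w1: torch.Tensor, eps: float = 1e-8) -> torch.Tensor:
    """Spherically interpolate between two weight tensors.

    Treats the flattened weights as vectors and moves along the arc
    between them; falls back to plain linear interpolation when the
    vectors are nearly parallel and the arc degenerates.
    """
    v0 = w0.flatten().float()
    v1 = w1.flatten().float()
    # Cosine of the angle between the two weight vectors.
    cos = torch.dot(v0, v1) / (v0.norm() * v1.norm() + eps)
    theta = torch.arccos(cos.clamp(-1.0, 1.0))
    if theta.abs() < 1e-4:
        # Nearly parallel vectors: slerp reduces to lerp.
        merged = (1 - t) * v0 + t * v1
    else:
        sin_theta = torch.sin(theta)
        merged = (torch.sin((1 - t) * theta) / sin_theta) * v0 \
               + (torch.sin(t * theta) / sin_theta) * v1
    return merged.reshape(w0.shape).to(w0.dtype)

# Example: blend two random "weight matrices" halfway along the arc.
a, b = torch.randn(4, 4), torch.randn(4, 4)
print(slerp(0.5, a, b))
```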

Performance Insights

Evaluations using the Nous benchmark suite, performed with LLM AutoEval, show the following scores:

  • Average: 34.14
  • AGIEval: 23.86
  • GPT4All: 36.55
  • TruthfulQA: 46.22
  • Bigbench: 29.94

Use Cases

This model is well-suited for a variety of general language generation and instruction-following tasks, benefiting from the combined characteristics of a strong base model and an instruction-tuned variant. Its balanced performance makes it a versatile choice for applications requiring both foundational language understanding and the ability to follow specific instructions.
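As a concrete starting point, here is a minimal text-generation sketch using the Hugging Face transformers library. It assumes the weights are published on the Hub under the arcee-ai/gemma-7b-slerp id and that a GPU is available; the prompt and generation settings are illustrative:

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "arcee-ai/gemma-7b-slerp"  # assumed Hub id, matching the model name
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,  # ~17 GB in bf16; fits a single 24 GB GPU
    device_map="auto",
)

prompt = "Summarize the benefits of merging a base and an instruction-tuned model."
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=200, do_sample=False)

# Strip the prompt tokens and print only the newly generated text.
new_tokens = outputs[0][inputs["input_ids"].shape[-1]:]
print(tokenizer.decode(new_tokens, skip_special_tokens=True))
```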