arcee-ai/Gemma-Zephyr-Dolly-Chat-Slerp

TEXT GENERATION · Concurrency Cost: 1 · Model Size: 8.5B · Quant: FP8 · Ctx Length: 8k · Published: Mar 3, 2024 · License: apache-2.0 · Architecture: Transformer · Open Weights

Gemma-Zephyr-Dolly-Chat-Slerp is an 8.5-billion-parameter language model created by arcee-ai by merging HuggingFaceH4/zephyr-7b-gemma-v0.1 with google/gemma-7b+philschmid/gemma-7b-dolly-chatml using the slerp merge method. The merge combines the strengths of its Gemma-based constituents for general language tasks, and its 8,192-token context length makes it suitable for applications requiring moderate context understanding.


Model Overview

Gemma-Zephyr-Dolly-Chat-Slerp is an 8.5 billion parameter language model developed by arcee-ai. It is a product of merging two distinct Gemma-based models: HuggingFaceH4/zephyr-7b-gemma-v0.1 and google/gemma-7b+philschmid/gemma-7b-dolly-chatml. This merge was performed using the slerp (Spherical Linear Interpolation) method via the mergekit tool.

Key Characteristics

  • Merged Architecture: Combines the capabilities of a Zephyr-Gemma variant and a Gemma model fine-tuned with Dolly-ChatML data.
  • Parameter Count: Features approximately 8.5 billion parameters, offering a balance between performance and computational requirements.
  • Merge Method: Utilizes slerp (spherical linear interpolation), which blends model weights along the arc between them rather than along a straight line, and applies different interpolation values to different layer types (self_attn and mlp); see the sketch after this list.
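
To make the merge method concrete, here is a minimal sketch of slerp applied to a pair of weight tensors. This is illustrative only: the function name and the `t_for` values are hypothetical, and the actual per-layer interpolation schedule is defined in arcee-ai's mergekit configuration, not in code like this.

```python
import torch

def slerp(t: float, w0: torch.Tensor, w1: torch.Tensor, eps: float = 1e-8) -> torch.Tensor:
    """Spherical linear interpolation between two weight tensors.

    Interpolates along the great-circle arc between the flattened weight
    vectors, preserving their angular relationship better than plain
    linear interpolation does.
    """
    v0, v1 = w0.flatten().float(), w1.flatten().float()
    # Angle between the two weight vectors.
    cos_theta = torch.dot(v0, v1) / (v0.norm() * v1.norm() + eps)
    theta = torch.acos(cos_theta.clamp(-1.0, 1.0))
    if theta.abs() < 1e-4:
        # Nearly parallel vectors: fall back to linear interpolation.
        merged = (1.0 - t) * v0 + t * v1
    else:
        sin_theta = torch.sin(theta)
        merged = (torch.sin((1.0 - t) * theta) / sin_theta) * v0 \
               + (torch.sin(t * theta) / sin_theta) * v1
    return merged.reshape(w0.shape).to(w0.dtype)

# Hypothetical per-layer-type interpolation factors; the real merge uses
# a schedule from the mergekit YAML config, not these example values.
t_for = {"self_attn": 0.3, "mlp": 0.7}
```

In practice the merge is produced by the mergekit tool from a declarative configuration rather than hand-written code; the sketch only illustrates the interpolation that slerp performs.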

Intended Use Cases

This model is designed for general-purpose language generation and understanding, benefiting from the combined training of its base models. Because it blends an instruction-following Zephyr variant with a Dolly-ChatML conversational fine-tune, it is well suited to applications that need both instruction adherence and chat-style interaction.
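
For instance, the model can be loaded with the standard Hugging Face transformers API, assuming the merged weights are published under the arcee-ai/Gemma-Zephyr-Dolly-Chat-Slerp repository and that it exposes a chat template inherited from its Gemma-based parents (if not, a plain prompt string via `tokenizer()` works the same way). This is a minimal sketch, not the vendor's documented usage:

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "arcee-ai/Gemma-Zephyr-Dolly-Chat-Slerp"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,  # assumption: bf16 weights fit your hardware
    device_map="auto",           # requires the accelerate package
)

# Chat-style prompt; whether a chat template is defined depends on the
# repository's tokenizer config, which is an assumption here.
messages = [{"role": "user", "content": "Summarize what a slerp model merge does."}]
inputs = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

outputs = model.generate(inputs, max_new_tokens=200, do_sample=True, temperature=0.7)
# Decode only the newly generated tokens, skipping the prompt.
print(tokenizer.decode(outputs[0][inputs.shape[-1]:], skip_special_tokens=True))
```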