MisGemma-7B: A Merged Language Model
MisGemma-7B is a 7-billion-parameter language model published by tushar310, created by merging two base models with mergekit:
- EmbeddedLLM/Mistral-7B-Merge-14-v0.1
- HuggingFaceH4/zephyr-7b-beta
Key Characteristics
This model uses the slerp (spherical linear interpolation) merge method to combine the weights of its constituent models. Rather than averaging weights linearly, slerp interpolates along the arc between the two weight tensors, and the merge applies a different interpolation factor t to different layers and components:
- Self-attention layers: the interpolation factor varies across layers between 0 and 1, with intermediate values of 0.5, 0.3, and 0.7.
- MLP (feed-forward) layers: the interpolation factor likewise varies between 0 and 1, with intermediate values of 0.5, 0.7, and 0.3.
- All other parameters: a default interpolation factor of 0.5.
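To make the interpolation concrete, here is a minimal pure-Python sketch of slerp over a flat weight vector (the function name and signature are illustrative, not mergekit's API; mergekit operates on full model tensors):

```python
import math

def slerp(t, v0, v1, eps=1e-8):
    """Spherical linear interpolation between two weight vectors.

    t=0 returns v0, t=1 returns v1; intermediate t moves along the
    great-circle arc between the directions of v0 and v1, which tends
    to preserve the magnitude structure that plain averaging flattens.
    """
    # Angle between the two vectors.
    dot = sum(a * b for a, b in zip(v0, v1))
    n0 = math.sqrt(sum(a * a for a in v0))
    n1 = math.sqrt(sum(b * b for b in v1))
    cos_omega = max(-1.0, min(1.0, dot / (n0 * n1)))
    omega = math.acos(cos_omega)
    if omega < eps:
        # Nearly parallel vectors: fall back to linear interpolation.
        return [(1 - t) * a + t * b for a, b in zip(v0, v1)]
    sin_omega = math.sin(omega)
    c0 = math.sin((1 - t) * omega) / sin_omega
    c1 = math.sin(t * omega) / sin_omega
    return [c0 * a + c1 * b for a, b in zip(v0, v1)]
```

For example, `slerp(0.5, [1.0, 0.0], [0.0, 1.0])` returns a point halfway along the arc between the two unit vectors, rather than the shorter chord midpoint `[0.5, 0.5]` that linear interpolation would give.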
Intended Use
MisGemma-7B is designed to inherit and combine the capabilities of its base models, making it suitable for a broad range of natural language processing tasks. Its lineage in Mistral-family weights and the instruction-tuned Zephyr suggests proficiency in areas such as:
- General text generation
- Conversational AI
- Instruction following
- Text summarization and analysis
The merged weights are stored in bfloat16, which halves the memory footprint relative to float32 while preserving float32's exponent range, trading a small amount of precision for cheaper inference.
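The layer-wise interpolation factors and dtype described above correspond to a mergekit slerp configuration of roughly the following shape. This is a sketch reconstructed from the description, not the exact file shipped with the model; in particular, the endpoint ordering of the per-layer gradients and the `layer_range` bounds are assumptions:

```yaml
slices:
  - sources:
      - model: EmbeddedLLM/Mistral-7B-Merge-14-v0.1
        layer_range: [0, 32]
      - model: HuggingFaceH4/zephyr-7b-beta
        layer_range: [0, 32]
merge_method: slerp
base_model: EmbeddedLLM/Mistral-7B-Merge-14-v0.1
parameters:
  t:
    - filter: self_attn
      value: [0, 0.5, 0.3, 0.7, 1]   # gradient interpolated across layers
    - filter: mlp
      value: [1, 0.5, 0.7, 0.3, 0]   # mirrored ordering is assumed
    - value: 0.5                      # default for all other parameters
dtype: bfloat16
```

mergekit expands each `value` list into a smooth per-layer gradient, so early and late layers lean toward one parent model while middle layers mix both.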