Name: harshitv804/MetaMath-Mistral-2x7B API
Brand: Featherless.ai
Price: 10.00 USD
Availability: InStock
Author: harshitv804

Model Overview

harshitv804/MetaMath-Mistral-2x7B is an experimental 7-billion parameter Mixture of Experts (MoE) model developed by harshitv804. It builds on the Mistral architecture and uses meta-math/MetaMath-Mistral-7B as its base model. It is intended primarily for experimenting with and learning about MoE architectures.

Merge Details

This model was created with the mergekit tool using the SLERP (spherical linear interpolation) merge method. Two instances of meta-math/MetaMath-Mistral-7B were merged to form this MoE configuration. The merge applied separate interpolation weights to the self-attention and MLP layers, as detailed in the YAML configuration provided with the model.
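To make the SLERP step concrete, the following is a minimal sketch of spherical linear interpolation between two flat weight vectors. It is an illustration of the general technique only, not mergekit's implementation: the function name `slerp`, the `eps` threshold, and the fallback to plain linear interpolation for near-parallel vectors are assumptions made for this example.

```python
import math


def slerp(t, v0, v1, eps=1e-8):
    """Spherically interpolate between vectors v0 and v1 (hypothetical helper).

    t = 0 returns v0, t = 1 returns v1; intermediate values follow the arc
    between the two directions instead of the straight line between them.
    """
    def lerp():
        # Plain linear interpolation, used when SLERP is ill-defined.
        return [(1 - t) * a + t * b for a, b in zip(v0, v1)]

    norm0 = math.sqrt(sum(x * x for x in v0))
    norm1 = math.sqrt(sum(x * x for x in v1))
    if norm0 < eps or norm1 < eps:
        return lerp()

    # Cosine of the angle between the two vectors, clamped for acos.
    cos_theta = sum(a * b for a, b in zip(v0, v1)) / (norm0 * norm1)
    cos_theta = max(-1.0, min(1.0, cos_theta))
    theta = math.acos(cos_theta)
    sin_theta = math.sin(theta)
    if abs(sin_theta) < eps:
        # Vectors are (anti)parallel: the SLERP weights would divide by ~0.
        return lerp()

    w0 = math.sin((1 - t) * theta) / sin_theta
    w1 = math.sin(t * theta) / sin_theta
    return [w0 * a + w1 * b for a, b in zip(v0, v1)]


# Example: halfway between two orthogonal unit vectors.
midpoint = slerp(0.5, [1.0, 0.0], [0.0, 1.0])
```

In a merge, `t` plays the role of the per-layer interpolation weight, so giving self-attention and MLP tensors different `t` values corresponds to the layer-specific weighting described above.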

Key Capabilities

Mathematical Reasoning: Inherits strong mathematical problem-solving capabilities from its MetaMath-Mistral-7B base.
Mixture of Experts Architecture: Serves as a practical example for understanding and experimenting with MoE models.

Intended Use

This model is suitable for researchers and developers interested in:

Exploring the behavior and performance of Mixture of Experts models.
Benchmarking mathematical reasoning tasks with an MoE-based approach.
Learning about model merging techniques, specifically SLERP, for creating custom LLMs.
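
For the benchmarking use case, one common scoring approach is exact-match accuracy on the final numeric answer. The sketch below assumes model responses have already been collected as strings and that each problem has a single numeric reference answer. The helper names `extract_final_number` and `exact_match_accuracy` are hypothetical, and taking the last number in a response as the answer is a simplifying assumption, not a convention documented for this model.

```python
import re


def extract_final_number(text):
    """Return the last number appearing in a response, or None (hypothetical helper)."""
    # Drop thousands separators so "1,234.5" parses as a single number.
    matches = re.findall(r"-?\d+(?:\.\d+)?", text.replace(",", ""))
    return float(matches[-1]) if matches else None


def exact_match_accuracy(responses, references, tol=1e-6):
    """Fraction of responses whose final number matches the reference answer."""
    if len(responses) != len(references):
        raise ValueError("responses and references must have the same length")
    if not responses:
        return 0.0
    correct = 0
    for response, reference in zip(responses, references):
        predicted = extract_final_number(response)
        if predicted is not None and abs(predicted - reference) <= tol:
            correct += 1
    return correct / len(responses)


# Example with made-up responses: the first is right, the second is wrong,
# and the third contains no number at all.
score = exact_match_accuracy(
    ["So 3 + 4 = 7.", "The answer is 10", "I am not sure"],
    [7, 11, 3],
)
```

Scoring the merged model and the MetaMath-Mistral-7B base with the same function on the same problems gives a like-for-like comparison.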
