Name: RatanRohith/NeuralPizza-7B-Merge-Slerp API
Brand: Featherless.ai
Price: 10.00 USD
Availability: InStock
Author: RatanRohith

NeuralPizza-7B-Merge-Slerp Overview

NeuralPizza-7B-Merge-Slerp is a 7-billion-parameter model developed by RatanRohith. It was produced with mergekit by merging two models, RatanRohith/NeuralPizza-7B-V0.1 and RatanRohith/NeuralPizza-7B-V0.2, using the slerp (spherical linear interpolation) merge method.

Key Characteristics

Merge Method: Uses slerp to combine the weights of its two source models. Slerp interpolates along the arc between two weight vectors rather than along the straight line between them.
Layer-Specific Merging: The merge configuration sets different interpolation parameters (t values) for self-attention (self_attn) and multi-layer perceptron (mlp) layers, so the two kinds of layers are blended in different proportions.
Base Models: Built from RatanRohith/NeuralPizza-7B-V0.1 and RatanRohith/NeuralPizza-7B-V0.2, two earlier versions of the same model by the same author.
Precision: Parameters are stored in bfloat16, a 16-bit floating-point format commonly used to reduce memory use when deploying large language models.
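
To make the merge method more concrete, here is a minimal sketch of slerp applied to two flattened weight vectors, together with a per-component lookup of t. It is an illustration under stated assumptions, not mergekit's actual implementation: the function names, the fallback to linear interpolation, and the t values in T_BY_FILTER are all hypothetical, and the real t values used for this model are not reproduced here.

```python
import math


def lerp(t, v0, v1):
    """Plain linear interpolation, used as a fallback for degenerate cases."""
    return [(1 - t) * a + t * b for a, b in zip(v0, v1)]


def slerp(t, v0, v1, eps=1e-8):
    """Spherically interpolate between two equal-length vectors.

    t = 0 returns v0, t = 1 returns v1.
    """
    norm0 = math.sqrt(sum(x * x for x in v0))
    norm1 = math.sqrt(sum(x * x for x in v1))
    # A zero-length vector has no direction, so fall back to lerp.
    if norm0 < eps or norm1 < eps:
        return lerp(t, v0, v1)
    # Cosine of the angle between the two directions, clamped against rounding.
    cos_theta = sum(a * b for a, b in zip(v0, v1)) / (norm0 * norm1)
    cos_theta = max(-1.0, min(1.0, cos_theta))
    theta = math.acos(cos_theta)
    sin_theta = math.sin(theta)
    # Nearly parallel (or antiparallel) vectors make the division unstable.
    if abs(sin_theta) < eps:
        return lerp(t, v0, v1)
    w0 = math.sin((1 - t) * theta) / sin_theta
    w1 = math.sin(t * theta) / sin_theta
    return [w0 * a + w1 * b for a, b in zip(v0, v1)]


# Hypothetical t values per component; NOT the values used for this model.
T_BY_FILTER = {"self_attn": 0.3, "mlp": 0.7}
DEFAULT_T = 0.5


def t_for(tensor_name):
    """Pick the interpolation parameter for a tensor based on its name."""
    for key, t in T_BY_FILTER.items():
        if key in tensor_name:
            return t
    return DEFAULT_T
```

In this sketch, a tensor whose name contains self_attn would be merged with one t and an mlp tensor with another, while everything else uses the default. That is the general idea behind assigning separate t values to the two kinds of layers.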

Use Cases

This model is suitable for applications that call for a blend of the capabilities of its two source NeuralPizza-7B models. Merging may improve generalization or combine the strengths of V0.1 and V0.2, though no benchmark results are given here to confirm either.
