PistachioAlt/Noromaid-Bagel-7B-Slerp
PistachioAlt/Noromaid-Bagel-7B-Slerp is a 7-billion-parameter language model created by PistachioAlt through a Slerp merge of jondurbin/bagel-dpo-7b-v0.1 and NeverSleep/Noromaid-7b-v0.1.1. The merge applies specific parameter weightings to the self-attention and MLP layers, and the resulting model is intended for general language tasks, combining the capabilities of its merged predecessors.
Noromaid-Bagel-7B-Slerp: A Merged Language Model
PistachioAlt/Noromaid-Bagel-7B-Slerp is a 7-billion-parameter model created through a Slerp (spherical linear interpolation) merge of two distinct base models: jondurbin/bagel-dpo-7b-v0.1 and NeverSleep/Noromaid-7b-v0.1.1. Rather than averaging weights on a straight line, Slerp interpolates along the arc between the two models' weight vectors, which better preserves their magnitudes and allows a nuanced combination of the characteristics and strengths of the constituent models.
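To make the interpolation concrete, here is a minimal NumPy sketch of spherical linear interpolation over two flattened weight tensors. It is illustrative only: the function name, the lerp fallback, and the whole-tensor treatment are simplifications, not mergekit's actual implementation.

```python
import numpy as np

def slerp(t: float, v0: np.ndarray, v1: np.ndarray, eps: float = 1e-8) -> np.ndarray:
    """Spherically interpolate between two flattened weight tensors.

    t=0 returns v0, t=1 returns v1; intermediate values follow the
    arc between the two directions rather than a straight line.
    """
    # Normalize copies to measure the angle between the two directions.
    v0_unit = v0 / (np.linalg.norm(v0) + eps)
    v1_unit = v1 / (np.linalg.norm(v1) + eps)
    dot = float(np.clip(np.dot(v0_unit, v1_unit), -1.0, 1.0))
    theta = np.arccos(dot)  # angle between v0 and v1

    # Nearly colinear vectors: fall back to plain linear interpolation.
    if np.sin(theta) < eps:
        return (1 - t) * v0 + t * v1

    s0 = np.sin((1 - t) * theta) / np.sin(theta)
    s1 = np.sin(t * theta) / np.sin(theta)
    return s0 * v0 + s1 * v1
```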
Key Merging Details
The Slerp merge method was applied across the full layer ranges of both base models. A specific parameter weighting scheme shaped the merge, with different interpolation values applied to different components (a reconstructed configuration follows the list):
- Self-Attention Layers: The `t` parameter for the self-attention layers was set to `[0, 0.5, 0.3, 0.7, 1]`, indicating a non-uniform blend across these layers.
- MLP Layers: For the multi-layer perceptron (MLP) layers, the `t` parameter was set to `[1, 0.5, 0.7, 0.3, 0]`, the mirror image of the attention schedule, also suggesting a tailored integration.
- General Parameters: A default `t` value of `0.3` was applied to all other parameters not covered by these filters.
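Read together, these values correspond to a mergekit-style configuration along the following lines. This is a reconstruction from the details above, not the published config file: the `layer_range` of 32 layers (typical for 7B Mistral-family models), the choice of `base_model`, and the `dtype` are assumptions.

```yaml
# Reconstructed mergekit config -- layer_range, base_model, and dtype are assumed.
slices:
  - sources:
      - model: jondurbin/bagel-dpo-7b-v0.1
        layer_range: [0, 32]   # assumed: 32 layers, typical of 7B Mistral-family models
      - model: NeverSleep/Noromaid-7b-v0.1.1
        layer_range: [0, 32]
merge_method: slerp
base_model: jondurbin/bagel-dpo-7b-v0.1  # assumed
parameters:
  t:
    - filter: self_attn
      value: [0, 0.5, 0.3, 0.7, 1]
    - filter: mlp
      value: [1, 0.5, 0.7, 0.3, 0]
    - value: 0.3  # default for all remaining parameters
dtype: bfloat16  # assumed
```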
This precise merging strategy aims to create a model that inherits beneficial traits from both bagel-dpo-7b-v0.1 and Noromaid-7b-v0.1.1, potentially offering improved performance or a unique blend of capabilities for general-purpose language generation and understanding tasks.
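For reference, the merged model can be loaded like any other Hugging Face causal language model. The generation settings below are ordinary defaults for illustration, not recommendations from the model's author.

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "PistachioAlt/Noromaid-Bagel-7B-Slerp"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype="auto",   # use the checkpoint's native precision
    device_map="auto",    # place layers on available GPU(s)/CPU
)

prompt = "Explain what a Slerp model merge does in one paragraph."
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=128, do_sample=True, temperature=0.7)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```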