ftamas/Dark-Science-12B
ftamas/Dark-Science-12B is a 12 billion parameter language model created by ftamas through a SLERP merge of Khetterman/DarkAtom-12B-v3 and Khetterman/AbominationScience-12B-v4. It combines the characteristics of its constituent models into a single checkpoint and supports a 32768 token context length.
Model Overview
ftamas/Dark-Science-12B is a 12 billion parameter language model resulting from a SLERP merge of two pre-trained models: Khetterman/DarkAtom-12B-v3 and Khetterman/AbominationScience-12B-v4. This merging process, conducted using mergekit, aims to combine the strengths and characteristics of the source models into a single, cohesive unit.
Merge Details
- Merge Method: The model was created using the SLERP (Spherical Linear Interpolation) merge method, which is known for producing stable and effective blends of model weights.
- Constituent Models: It integrates layers 0 through 40 from both `DarkAtom-12B-v3` and `AbominationScience-12B-v4`.
- Configuration: The merge applied specific interpolation parameters (`t` values) to the self-attention and MLP blocks, indicating a fine-tuned approach to weight blending rather than a simple average (see the interpolation sketch below).
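For readers curious what SLERP actually does to a pair of weight tensors, the sketch below shows the core interpolation in Python. It is illustrative only: mergekit's real merge runs per layer with the configured `t` schedule and handles additional edge cases, so treat this as a conceptual sketch rather than the tool's implementation.

```python
import torch

def slerp(t: float, a: torch.Tensor, b: torch.Tensor, eps: float = 1e-8) -> torch.Tensor:
    """Spherical linear interpolation between two weight tensors (illustrative sketch)."""
    a_flat, b_flat = a.flatten().float(), b.flatten().float()
    a_norm = a_flat / (a_flat.norm() + eps)
    b_norm = b_flat / (b_flat.norm() + eps)
    dot = torch.clamp(torch.dot(a_norm, b_norm), -1.0, 1.0)
    omega = torch.acos(dot)           # angle between the two weight vectors
    if omega.abs() < eps:             # nearly parallel: fall back to plain lerp
        blended = (1 - t) * a_flat + t * b_flat
    else:
        so = torch.sin(omega)
        blended = (torch.sin((1 - t) * omega) / so) * a_flat + (torch.sin(t * omega) / so) * b_flat
    return blended.reshape(a.shape).to(a.dtype)

# Example: blend a single attention weight tensor at t = 0.5
merged_w = slerp(0.5, torch.randn(128, 128), torch.randn(128, 128))
```

Compared with a plain weighted average, interpolating along the sphere preserves the norm structure of the blended weights, which is why SLERP merges tend to stay stable.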
Potential Use Cases
This merged model is designed for users looking for a unique blend of capabilities derived from its base models. It offers a 32768 token context length, making it suitable for tasks requiring extensive contextual understanding. Developers can leverage this model for applications where the combined strengths of DarkAtom-12B-v3 and AbominationScience-12B-v4 are beneficial, without needing to run multiple models.
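A minimal loading sketch follows, assuming the merged weights load through the standard Hugging Face `transformers` API, as is typical for mergekit outputs; the prompt and generation settings are placeholders, not recommended values.

```python
# Minimal sketch of loading ftamas/Dark-Science-12B with transformers.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "ftamas/Dark-Science-12B"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype="auto",   # keep the checkpoint's native precision
    device_map="auto",    # requires accelerate; places layers across available devices
)

prompt = "Summarize the idea behind SLERP model merging in two sentences."
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=128)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```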