redrix/sororicide-12B-Farer-Mell-Unslop
redrix/sororicide-12B-Farer-Mell-Unslop is a 12-billion-parameter language model merged by redrix using the NuSLERP method, with a context length of 32768 tokens. It combines components from Wayfarer-12B and MN-12B-Mag-Mell-R1 on a TheDrummer/UnslopNemo-12B-v4 base, aiming to blend the strengths of its constituent models for general language generation tasks.
Model Overview
redrix/sororicide-12B-Farer-Mell-Unslop is a 12 billion parameter language model created by redrix. It was developed using the mergekit tool, specifically employing the NuSLERP merge method. The model is built upon TheDrummer/UnslopNemo-12B-v4 as its base, integrating components from two distinct 12B models: inflatebot/MN-12B-Mag-Mell-R1 and LatitudeGames/Wayfarer-12B.
Merge Details
The merge configuration blends the weights of the constituent models: the self_attn and mlp tensors of LatitudeGames/Wayfarer-12B and inflatebot/MN-12B-Mag-Mell-R1 receive separate weights, while a general weight is applied to the remaining parameters. The chat_template is set to "chatml", and the tokenizer is built from the union of the source tokenizers. The merge was performed in bfloat16, with parameter normalization and int8 masking enabled.
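A mergekit NuSLERP configuration of this shape might look like the following sketch. The weight values shown are illustrative placeholders, not the actual values redrix used, which are not given here:

```yaml
# Illustrative mergekit config (weights are placeholders, not the original recipe)
merge_method: nuslerp
base_model: TheDrummer/UnslopNemo-12B-v4
models:
  - model: LatitudeGames/Wayfarer-12B
    parameters:
      weight:
        - filter: self_attn
          value: 0.6        # placeholder per-tensor weight
        - filter: mlp
          value: 0.4        # placeholder per-tensor weight
        - value: 0.5        # placeholder general weight
  - model: inflatebot/MN-12B-Mag-Mell-R1
    parameters:
      weight:
        - filter: self_attn
          value: 0.4
        - filter: mlp
          value: 0.6
        - value: 0.5
parameters:
  normalize: true
  int8_mask: true
dtype: bfloat16
tokenizer:
  source: union
chat_template: chatml
```

The filter entries let the two donors contribute unequally to attention versus feed-forward layers, matching the card's note that self_attn and mlp were weighted differently from the rest of the parameters.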
Key Characteristics
- Architecture: A merged model combining Wayfarer-12B and Mag-Mell-R1 on an UnslopNemo-12B-v4 base.
- Parameter Count: 12 billion parameters.
- Context Length: Supports a context window of 32768 tokens.
- Merge Method: Utilizes the NuSLERP method for combining model weights.
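NuSLERP is mergekit's refinement of spherical linear interpolation (SLERP). The core idea can be illustrated with the classic SLERP formula applied to two weight vectors; note this is a simplified sketch of the underlying interpolation, not mergekit's exact implementation:

```python
import numpy as np

def slerp(t, v0, v1, eps=1e-8):
    """Spherical linear interpolation between vectors v0 and v1 at fraction t."""
    # Measure the angle between the two vectors on the unit sphere.
    v0n = v0 / np.linalg.norm(v0)
    v1n = v1 / np.linalg.norm(v1)
    dot = np.clip(np.dot(v0n, v1n), -1.0, 1.0)
    theta = np.arccos(dot)
    if theta < eps:
        # Nearly parallel vectors: fall back to plain linear interpolation.
        return (1 - t) * v0 + t * v1
    sin_theta = np.sin(theta)
    # Weights follow arcs on the sphere rather than a straight chord,
    # which preserves the magnitude structure of the blended weights.
    return (np.sin((1 - t) * theta) / sin_theta) * v0 \
         + (np.sin(t * theta) / sin_theta) * v1
```

At t=0 the result is exactly the first model's weights, at t=1 the second's; intermediate values trace the arc between them instead of the straight line used by a plain weighted average.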
Potential Use Cases
This model is suitable for general-purpose language generation, given its foundation in multiple pre-trained models. Its merged nature suggests balanced performance across a range of tasks, making it a versatile option for developers exploring different language applications.
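Since the card specifies the "chatml" chat template, prompts should be wrapped in ChatML delimiters. A minimal formatter, assuming the standard `<|im_start|>`/`<|im_end|>` ChatML tokens (in practice, the tokenizer's built-in `apply_chat_template` would handle this):

```python
def build_chatml_prompt(messages):
    """Format a list of {'role', 'content'} dicts as a ChatML prompt string."""
    parts = []
    for m in messages:
        # Each turn is wrapped as <|im_start|>role\ncontent<|im_end|>
        parts.append(f"<|im_start|>{m['role']}\n{m['content']}<|im_end|>")
    # Open an assistant turn to cue the model to generate its reply.
    parts.append("<|im_start|>assistant\n")
    return "\n".join(parts)

prompt = build_chatml_prompt([
    {"role": "system", "content": "You are a helpful assistant."},
    {"role": "user", "content": "Write a short scene."},
])
```

The trailing open `<|im_start|>assistant` turn is what signals the model to start generating; generation is typically stopped on the `<|im_end|>` token.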