Model Overview
redrix/sororicide-12B-Farer-Mell-Unslop is a 12 billion parameter language model created by redrix. It was developed using the mergekit tool, specifically employing the NuSLERP merge method. The model is built upon TheDrummer/UnslopNemo-12B-v4 as its base, integrating components from two distinct 12B models: inflatebot/MN-12B-Mag-Mell-R1 and LatitudeGames/Wayfarer-12B.
Merge Details
The merge process involved a specific configuration to blend the weights of the constituent models. The self_attn and mlp layers of LatitudeGames/Wayfarer-12B and inflatebot/MN-12B-Mag-Mell-R1 were weighted differently, with a general weighting applied to other parameters. The chat_template is set to "chatml", and the tokenizer uses a union source. The merge was performed with bfloat16 dtype, and includes parameter normalization and int8 masking.
Key Characteristics
- Architecture: A merged model combining
Wayfarer-12B and Mag-Mell-R1 on an UnslopNemo-12B-v4 base. - Parameter Count: 12 billion parameters.
- Context Length: Supports a context window of 32768 tokens.
- Merge Method: Utilizes the NuSLERP method for combining model weights.
Potential Use Cases
This model is suitable for general-purpose language generation, given its foundation in multiple pre-trained models. Its merged nature suggests a balanced performance across various tasks, making it a versatile option for developers exploring different language applications.