darkc0de/Xortron7MethedUp-SLERP-8B

TEXT GENERATION · Concurrency Cost: 1 · Model Size: 8B · Quant: FP8 · Ctx Length: 8k · Published: Sep 9, 2024 · Architecture: Transformer

darkc0de/Xortron7MethedUp-SLERP-8B is an 8-billion-parameter language model created by darkc0de by merging mlabonne/NeuralDaredevil-8B-abliterated and mlabonne/Hermes-3-Llama-3.1-8B-lorablated with the SLERP method. The merge draws on the strengths of both parents, offering a versatile foundation for generative AI tasks. It is designed for general-purpose language generation and understanding within an 8192-token context window.


Model Overview

darkc0de/Xortron7MethedUp-SLERP-8B is an 8-billion-parameter language model, developed by darkc0de, that combines two base models: mlabonne/NeuralDaredevil-8B-abliterated and mlabonne/Hermes-3-Llama-3.1-8B-lorablated. It was created with the SLERP (Spherical Linear Interpolation) merge method, which blends two models' weights along an arc on the unit hypersphere rather than along a straight line, preserving the geometric character of each parent's weight space and yielding a balanced performance profile.
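For intuition, here is a minimal sketch of SLERP applied to a single pair of weight tensors. This is not darkc0de's actual merge script (the merge was presumably produced with a tool such as mergekit); the tensor shapes, the `t` value, and the near-parallel fallback threshold are illustrative.

```python
import numpy as np

def slerp(t: float, v0: np.ndarray, v1: np.ndarray, eps: float = 1e-8) -> np.ndarray:
    """Spherical linear interpolation between two weight tensors.

    t=0 returns v0, t=1 returns v1; intermediate values follow the
    great-circle arc between the normalized tensors.
    """
    v0_flat, v1_flat = v0.ravel(), v1.ravel()
    # Normalize copies only to measure the angle between the tensors.
    v0_n = v0_flat / (np.linalg.norm(v0_flat) + eps)
    v1_n = v1_flat / (np.linalg.norm(v1_flat) + eps)
    dot = np.clip(np.dot(v0_n, v1_n), -1.0, 1.0)
    omega = np.arccos(dot)  # angle between the two weight directions
    if np.abs(np.sin(omega)) < eps:
        # Nearly parallel tensors: fall back to plain linear interpolation.
        return (1.0 - t) * v0 + t * v1
    # Standard SLERP coefficients, applied to the original tensors.
    a = np.sin((1.0 - t) * omega) / np.sin(omega)
    b = np.sin(t * omega) / np.sin(omega)
    return (a * v0_flat + b * v1_flat).reshape(v0.shape)

# Example: blend two random "layers", weighting the second model at t=0.7.
rng = np.random.default_rng(0)
w_a, w_b = rng.normal(size=(4, 4)), rng.normal(size=(4, 4))
print(slerp(0.7, w_a, w_b))
```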

Merge Details

The merge used mlabonne/NeuralDaredevil-8B-abliterated as the base model, with both source models contributing their full layer range (layers 0 to 32). The SLERP interpolation factor t varied by component, with separate schedules for the self_attn and mlp tensors, to fine-tune how much each parent influences each part of the network. The merged weights use the bfloat16 data type. An illustrative mergekit-style configuration consistent with these details is sketched below.
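The model card does not reproduce the exact configuration, so the t schedules in the following sketch are hypothetical placeholders; only the source models, layer ranges, merge method, base model, filter names, and dtype come from the details above.

```yaml
# Illustrative mergekit SLERP config; the t values are placeholders,
# NOT the actual schedules used for Xortron7MethedUp-SLERP-8B.
slices:
  - sources:
      - model: mlabonne/NeuralDaredevil-8B-abliterated
        layer_range: [0, 32]
      - model: mlabonne/Hermes-3-Llama-3.1-8B-lorablated
        layer_range: [0, 32]
merge_method: slerp
base_model: mlabonne/NeuralDaredevil-8B-abliterated
parameters:
  t:
    - filter: self_attn
      value: [0, 0.5, 0.3, 0.7, 1]  # placeholder per-layer schedule
    - filter: mlp
      value: [1, 0.5, 0.7, 0.3, 0]  # placeholder per-layer schedule
    - value: 0.5                     # default for all other tensors
dtype: bfloat16
```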

Key Characteristics

  • Parameter Count: 8 billion parameters.
  • Merge Method: Utilizes the SLERP method for weight interpolation.
  • Base Models: Merges mlabonne/NeuralDaredevil-8B-abliterated and mlabonne/Hermes-3-Llama-3.1-8B-lorablated.
  • Context Length: Supports an 8192-token context window.

Potential Use Cases

This merged model is suitable for a range of applications where a robust 8B-parameter model with a balanced blend of capabilities is beneficial. Its foundation in well-regarded base models suggests potential for the following (a minimal loading sketch appears after the list):

  • General text generation and completion.
  • Instruction following and conversational AI.
  • Tasks requiring nuanced language understanding, benefiting from the combined strengths of its constituents.
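As a standard 8B Llama-architecture merge, the model should load through the usual transformers text-generation path. A minimal sketch, assuming the checkpoint is hosted on the Hugging Face Hub under the repo id above and that your hardware can accommodate an 8B model:

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

# Assumes the merged checkpoint is published under this repo id;
# adjust dtype and device placement for your hardware.
model_id = "darkc0de/Xortron7MethedUp-SLERP-8B"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,  # matches the merge's bfloat16 weights
    device_map="auto",
)

prompt = "Explain spherical linear interpolation in one paragraph."
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(
    **inputs, max_new_tokens=200, do_sample=True, temperature=0.7
)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```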