MisterRaven006/SweetNeural-7B

Text generation · Concurrency cost: 1 · Model size: 7B · Quant: FP8 · Context length: 4k · Published: May 15, 2024 · License: apache-2.0 · Architecture: Transformer · Open weights · Cold

MisterRaven006/SweetNeural-7B is a 7 billion parameter language model created by MisterRaven006 through a SLERP merge of KatyTheCutie/LemonadeRP-4.5.3 and mlabonne/NeuralBeagle14-7B. The merge combines characteristics from both base models into a versatile foundation for general natural language tasks, and its 4096-token context length supports moderately long inputs.


Overview

MisterRaven006/SweetNeural-7B is a 7 billion parameter language model developed by MisterRaven006. It was created using the SLERP (Spherical Linear Interpolation) merge method, combining two pre-trained models: KatyTheCutie/LemonadeRP-4.5.3 and mlabonne/NeuralBeagle14-7B. This merging technique blends the weights, and thus the strengths and characteristics, of its constituent models into a single model that retains their shared architecture.
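Conceptually, SLERP interpolates each pair of corresponding weight tensors along the great-circle arc between them rather than along a straight line, which preserves the magnitude of the blended weights better than plain averaging. A minimal sketch of the per-tensor operation (the function name and the use of flat NumPy vectors are illustrative, not taken from mergekit's internals):

```python
import numpy as np

def slerp(t, v0, v1, eps=1e-8):
    """Spherically interpolate between weight vectors v0 and v1 at fraction t."""
    # Normalized copies are used only to measure the angle between the vectors.
    v0_n = v0 / (np.linalg.norm(v0) + eps)
    v1_n = v1 / (np.linalg.norm(v1) + eps)
    dot = np.clip(np.dot(v0_n, v1_n), -1.0, 1.0)
    theta = np.arccos(dot)  # angle between the two tensors
    # Nearly parallel vectors: fall back to ordinary linear interpolation.
    if abs(theta) < 1e-4:
        return (1 - t) * v0 + t * v1
    sin_theta = np.sin(theta)
    # Weights follow the great-circle arc; t=0 returns v0, t=1 returns v1.
    return (np.sin((1 - t) * theta) / sin_theta) * v0 + \
           (np.sin(t * theta) / sin_theta) * v1
```

At `t = 0.5` the result sits midway along the arc between the two tensors, which is where the interpolation differs most from a simple average.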

Merge Details

The model was built with mergekit, combining layers from mlabonne/NeuralBeagle14-7B and KatyTheCutie/LemonadeRP-4.5.3 across a layer_range of [0, 32], with mlabonne/NeuralBeagle14-7B as the base_model. The interpolation factor t was set separately for the self_attn and mlp layers, giving finer control over how much each source model contributes to attention versus feed-forward weights in the final model.
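A mergekit configuration for this kind of SLERP merge typically looks like the following. This is a hedged reconstruction: the structure matches mergekit's documented slerp schema, but the exact t schedules and dtype are illustrative placeholders, since the card does not list them.

```yaml
# Illustrative mergekit SLERP config; t values and dtype are assumed, not from the card.
slices:
  - sources:
      - model: mlabonne/NeuralBeagle14-7B
        layer_range: [0, 32]
      - model: KatyTheCutie/LemonadeRP-4.5.3
        layer_range: [0, 32]
merge_method: slerp
base_model: mlabonne/NeuralBeagle14-7B
parameters:
  t:
    - filter: self_attn
      value: [0, 0.5, 0.3, 0.7, 1]   # example per-layer schedule for attention weights
    - filter: mlp
      value: [1, 0.5, 0.7, 0.3, 0]   # example per-layer schedule for feed-forward weights
    - value: 0.5                     # default for all other tensors
dtype: bfloat16
```

The filter entries are what allow the self_attn and mlp layers to use different t values, as described above.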

Key Characteristics

  • Architecture: A merged model derived from KatyTheCutie/LemonadeRP-4.5.3 and mlabonne/NeuralBeagle14-7B.
  • Parameter Count: 7 billion parameters.
  • Context Length: Supports a context window of 4096 tokens.
  • Merge Method: Utilizes the SLERP (Spherical Linear Interpolation) method for combining model weights.

Potential Use Cases

Given its merged nature, SweetNeural-7B is likely suitable for a range of general-purpose natural language tasks, potentially inheriting capabilities from both base models. Developers looking for a 7B model that blends characteristics of established models may find it useful for experimentation or as a starting point for fine-tuning.