cookinai/BruinHermes

Text Generation | Concurrency Cost: 1 | Model Size: 7B | Quant: FP8 | Ctx Length: 4k | Published: Dec 17, 2023 | License: unknown | Architecture: Transformer

BruinHermes is a merged language model created by cookinai, combining rwitz2/go-bruins-v2.1.1 and Weyaxi/OpenHermes-2.5-neural-chat-v3-3-Slerp using a slerp merge method. The merge applies separate slerp interpolation parameters to the self_attn and mlp layers, with a fixed value for all remaining tensors. It is intended to blend the capabilities of both base models and is suitable for general conversational and instruction-following tasks.

BruinHermes: A Slerp Merged Language Model

BruinHermes is a composite language model developed by cookinai, created through a slerp merge of two base models: rwitz2/go-bruins-v2.1.1 and Weyaxi/OpenHermes-2.5-neural-chat-v3-3-Slerp. The merge aims to combine the strengths of both models in a single model.

Key Merging Details

  • Merge Method: Slerp (Spherical Linear Interpolation). Slerp interpolates along the arc between two weight tensors rather than along the straight line between them, with an interpolation factor t that runs from 0 (first model only) to 1 (second model only).
  • Base Models: The merge incorporates rwitz2/go-bruins-v2.1.1 and Weyaxi/OpenHermes-2.5-neural-chat-v3-3-Slerp.
  • Layer-Specific Parameters: The slerp merge was applied with different t values for different architectural components:
    • self_attn layers received t values of [0, 0.5, 0.3, 0.7, 1].
    • mlp layers received t values of [1, 0.5, 0.7, 0.3, 0].
    • A fallback t value of 0.5 was used for all other tensors, ensuring a balanced contribution from both base models where specific tuning wasn't applied.
  • Data Type: The merge was performed using bfloat16 precision.
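
The parameters above can be illustrated with a minimal Python sketch. It is not the code used to build BruinHermes. It assumes NumPy, assumes each list of t values acts as a set of anchor points spread evenly over layer depth (a common reading of such lists in merge tooling, but not confirmed by this card), and uses hypothetical helper names (slerp, layer_t, t_for) and hypothetical tensor names. Which base model corresponds to t = 0 is not stated in the source.

```python
import numpy as np

def slerp(t, v0, v1, eps=1e-8):
    """Spherical linear interpolation between two weight tensors.

    t = 0 returns v0, t = 1 returns v1.
    """
    v0 = np.asarray(v0, dtype=np.float64)
    v1 = np.asarray(v1, dtype=np.float64)
    # Normalized copies are used only to measure the angle between the tensors.
    u0 = v0.ravel() / (np.linalg.norm(v0) + eps)
    u1 = v1.ravel() / (np.linalg.norm(v1) + eps)
    dot = float(np.clip(np.dot(u0, u1), -1.0, 1.0))
    # Nearly colinear tensors: fall back to plain linear interpolation.
    if abs(dot) > 0.9995:
        return (1.0 - t) * v0 + t * v1
    theta = np.arccos(dot)
    sin_theta = np.sin(theta)
    w0 = np.sin((1.0 - t) * theta) / sin_theta
    w1 = np.sin(t * theta) / sin_theta
    return w0 * v0 + w1 * v1

def layer_t(anchors, layer_idx, num_layers):
    """Assumed gradient reading: anchors are spaced evenly over layer depth
    and intermediate layers are linearly interpolated between them."""
    if num_layers == 1:
        return float(anchors[0])
    pos = layer_idx / (num_layers - 1)
    xs = np.linspace(0.0, 1.0, len(anchors))
    return float(np.interp(pos, xs, anchors))

# Values taken from the list above.
SELF_ATTN_T = [0, 0.5, 0.3, 0.7, 1]
MLP_T = [1, 0.5, 0.7, 0.3, 0]
DEFAULT_T = 0.5

def t_for(tensor_name, layer_idx, num_layers):
    """Pick the interpolation factor for a tensor by (hypothetical) name."""
    if "self_attn" in tensor_name:
        return layer_t(SELF_ATTN_T, layer_idx, num_layers)
    if "mlp" in tensor_name:
        return layer_t(MLP_T, layer_idx, num_layers)
    return DEFAULT_T

if __name__ == "__main__":
    a = np.array([1.0, 0.0])
    b = np.array([0.0, 1.0])
    print(slerp(0.5, a, b))  # midpoint on the arc between a and b
    # Layer count of 32 is an illustrative assumption.
    print(t_for("model.layers.0.self_attn.q_proj.weight", 0, 32))
    print(t_for("model.layers.0.mlp.up_proj.weight", 0, 32))
```

Under this reading, the two lists mirror each other: in the earliest layers the attention weights come entirely from one model and the MLP weights entirely from the other, and the roles are swapped in the final layers.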

Potential Use Cases

Given its origin from two instruction-tuned models, BruinHermes is likely well-suited for:

  • General-purpose conversational AI: Engaging in natural language dialogues.
  • Instruction following: Executing commands and generating responses based on specific instructions.
  • Text generation: Creating coherent and contextually relevant text for various applications.