BruinHermes: A Slerp Merged Language Model
BruinHermes is a composite language model developed by cookinai, created through a slerp merge of two distinct base models: rwitz2/go-bruins-v2.1.1 and Weyaxi/OpenHermes-2.5-neural-chat-v3-3-Slerp. This merging technique aims to combine the strengths and characteristics of both foundational models into a single, more versatile model.
Key Merging Details
- Merge Method: Slerp (Spherical Linear Interpolation) was utilized for the merging process, allowing for a nuanced combination of the model weights.
- Base Models: The merge incorporates rwitz2/go-bruins-v2.1.1 and Weyaxi/OpenHermes-2.5-neural-chat-v3-3-Slerp, with the intent of blending their respective capabilities.
- Layer-Specific Parameters: The slerp merge was applied with different t values for different architectural components: self_attn layers received t values of [0, 0.5, 0.3, 0.7, 1], while mlp layers received t values of [1, 0.5, 0.7, 0.3, 0]. A fallback t value of 0.5 was used for all other tensors, ensuring a balanced contribution from both base models where no specific tuning was applied.
- Data Type: The merge was performed in bfloat16 precision.
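To make the merge method concrete, here is a minimal sketch of spherical linear interpolation applied to a pair of weight tensors. This is an illustrative implementation, not the exact code used to produce BruinHermes: it treats each tensor as a flat vector, interpolates along the great-circle arc between the two, and falls back to linear interpolation when the tensors are nearly parallel. A t of 0 returns the first model's weights and a t of 1 returns the second's, matching the per-layer t values listed above.

```python
import numpy as np

def slerp(t, a, b, eps=1e-8):
    """Spherical linear interpolation between two weight tensors.

    t=0 returns a, t=1 returns b; intermediate t values move along
    the arc between the (normalized) tensors rather than a straight
    line, which better preserves weight magnitudes when merging.
    """
    a_flat = a.ravel().astype(np.float64)
    b_flat = b.ravel().astype(np.float64)

    # Angle between the two tensors, treated as flat vectors.
    a_n = a_flat / (np.linalg.norm(a_flat) + eps)
    b_n = b_flat / (np.linalg.norm(b_flat) + eps)
    dot = np.clip(np.dot(a_n, b_n), -1.0, 1.0)
    theta = np.arccos(dot)

    # Nearly parallel tensors: slerp degenerates, use plain lerp.
    if theta < eps:
        return (1.0 - t) * a + t * b

    s = np.sin(theta)
    return (np.sin((1.0 - t) * theta) / s) * a + (np.sin(t * theta) / s) * b
```

In a full merge, a loop over the checkpoint's tensors would call this with the t value assigned to each tensor's component (self_attn, mlp, or the 0.5 fallback) and cast the result back to bfloat16.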
Potential Use Cases
Given its origin from two instruction-tuned models, BruinHermes is likely well-suited for:
- General-purpose conversational AI: Engaging in natural language dialogues.
- Instruction following: Executing commands and generating responses based on specific instructions.
- Text generation: Creating coherent and contextually relevant text for various applications.