Overview
TurdusBeagle-7B is a 7-billion-parameter language model developed by leveldevai. It is a merged model, created using LazyMergekit from two base models: udkai/Turdus and mlabonne/NeuralBeagle14-7B.
Merging Strategy
The model employs a slerp (spherical linear interpolation) merge method. This technique combines the weights of the two base models with specific interpolation values (t) applied to different components:
- Self-attention layers (self_attn): interpolation values range from 0 to 1, with specific values for different layers.
- MLP layers (mlp): interpolation values also range from 0 to 1, with a different distribution than self-attention.
- Other tensors: a fallback value of 0.45 is used for all remaining tensors.
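A LazyMergekit slerp configuration has the following shape. The per-layer t schedules and the base_model choice below are illustrative placeholders, not the model's published values; only the 0.45 fallback comes from the description above:

```yaml
slices:
  - sources:
      - model: udkai/Turdus
        layer_range: [0, 32]
      - model: mlabonne/NeuralBeagle14-7B
        layer_range: [0, 32]
merge_method: slerp
base_model: mlabonne/NeuralBeagle14-7B   # assumption: either parent could serve as base
parameters:
  t:
    - filter: self_attn
      value: [0, 0.5, 0.3, 0.7, 1]       # illustrative per-layer schedule
    - filter: mlp
      value: [1, 0.5, 0.7, 0.3, 0]       # illustrative, differently distributed
    - value: 0.45                        # fallback for all other tensors
dtype: bfloat16
```

Each filter entry applies its t schedule to matching tensor names; mergekit interpolates the schedule across the model's layers.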
This per-component schedule aims to combine the distinct strengths of Turdus and NeuralBeagle14-7B, potentially improving overall performance or specializing the merge for certain tasks.
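The interpolation itself is simple to sketch. A minimal, self-contained slerp over flattened weight vectors (ignoring mergekit's per-tensor handling) could look like this:

```python
import math

def slerp(t, v0, v1, eps=1e-8):
    """Spherical linear interpolation between two weight vectors.

    t=0 returns v0, t=1 returns v1; intermediate t follows the
    great-circle arc between the two directions rather than a
    straight line, preserving the vectors' angular structure.
    """
    # Cosine of the angle between the normalized vectors.
    n0 = math.sqrt(sum(x * x for x in v0))
    n1 = math.sqrt(sum(x * x for x in v1))
    dot = sum(a * b for a, b in zip(v0, v1)) / max(n0 * n1, eps)
    dot = max(-1.0, min(1.0, dot))
    theta = math.acos(dot)
    if theta < eps:
        # Nearly colinear vectors: fall back to linear interpolation.
        return [(1 - t) * a + t * b for a, b in zip(v0, v1)]
    s0 = math.sin((1 - t) * theta) / math.sin(theta)
    s1 = math.sin(t * theta) / math.sin(theta)
    return [s0 * a + s1 * b for a, b in zip(v0, v1)]
```

For example, slerping halfway between the orthogonal unit vectors [1, 0] and [0, 1] yields a unit vector at 45 degrees, whereas plain linear interpolation would shrink its norm to about 0.71.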
Usage
The model can be easily integrated into Python applications using the transformers library. It supports standard text generation pipelines, allowing users to generate responses based on chat templates. The provided example demonstrates how to load the model and tokenizer, apply a chat template, and generate text with specified parameters like max_new_tokens, temperature, top_k, and top_p.
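A minimal sketch of such a pipeline, assuming the transformers library is installed; the model ID follows the naming above, and the prompt and sampling values are illustrative defaults, not prescribed settings:

```python
# Illustrative sampling parameters; tune these for your use case.
GENERATION_KWARGS = {
    "max_new_tokens": 256,
    "do_sample": True,
    "temperature": 0.7,
    "top_k": 50,
    "top_p": 0.95,
}

def generate_reply(prompt: str, model_id: str = "leveldevai/TurdusBeagle-7B") -> str:
    """Generate a chat-formatted response from the merged model."""
    # Imported lazily so the heavy dependency is only loaded at call time.
    from transformers import AutoTokenizer, pipeline

    tokenizer = AutoTokenizer.from_pretrained(model_id)

    # Render the conversation with the tokenizer's chat template.
    messages = [{"role": "user", "content": prompt}]
    text = tokenizer.apply_chat_template(
        messages, tokenize=False, add_generation_prompt=True
    )

    # Standard text-generation pipeline with the sampling parameters above.
    generator = pipeline("text-generation", model=model_id, tokenizer=tokenizer)
    outputs = generator(text, **GENERATION_KWARGS)
    return outputs[0]["generated_text"]
```

Calling generate_reply("Explain model merging.") downloads the weights on first use, which requires roughly 15 GB of disk space for a 7B model in half precision.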