shadowml/BeagleSempra-7B
Overview
BeagleSempra-7B is a 7 billion parameter language model developed by shadowml, created by merging two models: shadowml/WestBeagle-7B and FelixChao/Sectumsempra-7B-DPO. The merge was performed with the LazyMergekit tool using the slerp (spherical linear interpolation) merge method.
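To make the merge method concrete, the sketch below shows the core slerp operation on a pair of weight tensors. This is a simplified single-vector illustration of the idea, not mergekit's actual implementation; the function name and the lerp fallback are choices made for this sketch.

```python
import torch

def slerp(t: float, a: torch.Tensor, b: torch.Tensor,
          eps: float = 1e-8) -> torch.Tensor:
    """Spherical linear interpolation between two weight tensors,
    treating each tensor as a single flattened vector."""
    a_flat = a.flatten().float()
    b_flat = b.flatten().float()
    # Angle between the two weight vectors
    cos_omega = torch.clamp(
        torch.dot(a_flat, b_flat) / (a_flat.norm() * b_flat.norm() + eps),
        -1.0, 1.0,
    )
    omega = torch.acos(cos_omega)
    sin_omega = torch.sin(omega)
    if sin_omega.abs() < eps:
        # Nearly parallel vectors: fall back to plain linear interpolation
        merged = (1.0 - t) * a_flat + t * b_flat
    else:
        # Walk along the arc between the vectors instead of the chord
        merged = (torch.sin((1.0 - t) * omega) / sin_omega) * a_flat \
               + (torch.sin(t * omega) / sin_omega) * b_flat
    return merged.reshape(a.shape).to(a.dtype)

# Blend two toy "weight" tensors 30% of the way from a to b
a, b = torch.randn(4, 4), torch.randn(4, 4)
print(slerp(0.3, a, b))
```

Interpolating along the arc rather than the straight line between weight vectors better preserves their geometry, which is why slerp is often preferred over plain weight averaging for merges.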
Key Characteristics
- Architecture: A merged model combining the strengths of WestBeagle-7B and Sectumsempra-7B-DPO.
- Parameter Count: 7 billion parameters, offering a balance between performance and computational efficiency.
- Context Length: Supports a context window of 4096 tokens, suitable for processing moderately long inputs.
- Merge Method: Utilizes `slerp` for combining model weights, with specific `t` parameters applied to the `self_attn` and `mlp` layers, indicating a fine-tuned merging process.
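For reference, a LazyMergekit-style slerp configuration has the shape shown below. The layer ranges and interpolation values are illustrative placeholders, not the published BeagleSempra-7B settings; the `[0, 32]` range assumes the 32 transformer layers typical of 7B Mistral-class models.

```python
# A minimal sketch of a mergekit slerp config as LazyMergekit builds it:
# the YAML is assembled as a string, written to disk, and then consumed
# by the mergekit-yaml CLI. All numeric values below are illustrative.
yaml_config = """
slices:
  - sources:
      - model: shadowml/WestBeagle-7B
        layer_range: [0, 32]
      - model: FelixChao/Sectumsempra-7B-DPO
        layer_range: [0, 32]
merge_method: slerp
base_model: shadowml/WestBeagle-7B
parameters:
  t:
    - filter: self_attn            # schedule for attention weights
      value: [0, 0.5, 0.3, 0.7, 1]
    - filter: mlp                  # mirrored schedule for MLP weights
      value: [1, 0.5, 0.7, 0.3, 0]
    - value: 0.5                   # even blend for all other tensors
dtype: bfloat16
"""

with open("config.yaml", "w", encoding="utf-8") as f:
    f.write(yaml_config)
```

With the file written, a merge would be run with something like `mergekit-yaml config.yaml ./merged-model`; check the mergekit documentation for current flags.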
Usage
This model is suitable for a variety of text generation tasks. Developers can integrate it into Python projects with the Hugging Face transformers library, as sketched below. It supports standard text generation pipelines with customizable parameters such as `max_new_tokens`, `temperature`, `top_k`, and `top_p` for controlling output creativity and coherence.
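A minimal loading-and-generation sketch follows, assuming a single-GPU setup and that the tokenizer ships a chat template (if not, pass a plain prompt string instead). The prompt text and sampling values are illustrative, not tuned recommendations.

```python
import torch
from transformers import AutoTokenizer, pipeline

model_id = "shadowml/BeagleSempra-7B"
tokenizer = AutoTokenizer.from_pretrained(model_id)

# Build a prompt via the tokenizer's chat template (assumed to exist here)
messages = [{"role": "user", "content": "What is a merged language model?"}]
prompt = tokenizer.apply_chat_template(
    messages, tokenize=False, add_generation_prompt=True
)

# Half precision plus device_map="auto" (requires `accelerate`) lets a
# 7B model fit on a single consumer GPU
generator = pipeline(
    "text-generation",
    model=model_id,
    tokenizer=tokenizer,
    torch_dtype=torch.float16,
    device_map="auto",
)

outputs = generator(
    prompt,
    max_new_tokens=256,  # upper bound on generated tokens
    do_sample=True,
    temperature=0.7,     # lower values make output more deterministic
    top_k=50,            # restrict sampling to the 50 most likely tokens
    top_p=0.95,          # nucleus sampling threshold
)
print(outputs[0]["generated_text"])
```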