shadowml/BeagleSempra-7B

Text Generation · Concurrency Cost: 1 · Model Size: 7B · Quantization: FP8 · Context Length: 4k · Published: Jan 30, 2024 · License: cc-by-nc-4.0 · Architecture: Transformer · Open Weights

BeagleSempra-7B is a 7 billion parameter language model developed by shadowml, created by merging shadowml/WestBeagle-7B and FelixChao/Sectumsempra-7B-DPO using a slerp merge method. This model leverages the strengths of its constituent models, offering a 4096-token context length. It is designed for general language generation tasks, benefiting from the combined capabilities of its base components.

BeagleSempra-7B Overview

BeagleSempra-7B is a 7 billion parameter language model developed by shadowml, constructed through a strategic merge of two distinct models: shadowml/WestBeagle-7B and FelixChao/Sectumsempra-7B-DPO. This integration was performed using the LazyMergekit tool, specifically employing the slerp (spherical linear interpolation) merge method.

Key Characteristics

  • Architecture: A merged model combining the strengths of WestBeagle-7B and Sectumsempra-7B-DPO.
  • Parameter Count: 7 billion parameters, offering a balance between performance and computational efficiency.
  • Context Length: Supports a context window of 4096 tokens, suitable for processing moderately long inputs.
  • Merge Method: Utilizes slerp for combining model weights, with specific t parameters applied to self_attn and mlp layers, indicating a fine-tuned merging process.
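The slerp merge interpolates along the great circle between the two models' weight directions rather than along a straight line, which preserves tensor norms better than plain averaging. A minimal sketch of the idea, assuming PyTorch and treating each pair of corresponding weight tensors independently (the exact per-layer `t` schedule used for this model is not reproduced here):

```python
import torch

def slerp(t: float, a: torch.Tensor, b: torch.Tensor, eps: float = 1e-8) -> torch.Tensor:
    """Spherical linear interpolation between two weight tensors of the same shape.

    t=0 returns a, t=1 returns b; intermediate t blends along the arc
    between the two tensors' directions.
    """
    a_flat = a.flatten().float()
    b_flat = b.flatten().float()
    # Angle between the two weight vectors, computed on normalized copies.
    a_norm = a_flat / (a_flat.norm() + eps)
    b_norm = b_flat / (b_flat.norm() + eps)
    dot = torch.clamp(a_norm @ b_norm, -1.0, 1.0)
    omega = torch.acos(dot)
    if omega.abs() < eps:
        # Nearly parallel tensors: slerp degenerates to linear interpolation.
        return ((1 - t) * a_flat + t * b_flat).view_as(a)
    so = torch.sin(omega)
    coef_a = torch.sin((1 - t) * omega) / so
    coef_b = torch.sin(t * omega) / so
    return (coef_a * a_flat + coef_b * b_flat).view_as(a)
```

In a real merge, a tool like LazyMergekit applies a function of this shape to every matched tensor pair, with different `t` values filtered to the `self_attn` and `mlp` layers as noted above.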

Usage

This model is suited to general text generation tasks. Developers can integrate it into Python projects using the Hugging Face transformers library; it works with standard text generation pipelines, with parameters such as max_new_tokens, temperature, top_k, and top_p controlling output creativity and coherence.
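A minimal usage sketch with the transformers `pipeline` API, assuming the weights are hosted on the Hugging Face Hub under `shadowml/BeagleSempra-7B` and that `transformers` and a compatible `torch` build are installed (loading a 7B model requires substantial memory, so the heavy work is kept inside a function):

```python
from transformers import pipeline

def generate(prompt: str) -> str:
    # Downloads and loads the ~7B-parameter model on first call;
    # device_map="auto" places it on GPU if one is available.
    generator = pipeline(
        "text-generation",
        model="shadowml/BeagleSempra-7B",
        device_map="auto",
    )
    # Sampling parameters from the model card: temperature/top_k/top_p
    # trade off creativity against coherence.
    out = generator(
        prompt,
        max_new_tokens=128,
        do_sample=True,
        temperature=0.7,
        top_k=50,
        top_p=0.95,
    )
    return out[0]["generated_text"]
```

Calling `generate("Explain model merging in one sentence.")` returns the prompt followed by up to 128 sampled continuation tokens; lower the temperature for more deterministic output.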