FelixChao/Sirius-10B

Text Generation · Model Size: 10.7B · Quant: FP8 · Context Length: 4K · Published: Jan 22, 2024 · License: apache-2.0 · Architecture: Transformer

Sirius-10B is a 10.7 billion parameter language model developed by FelixChao, created by merging leveldevai/TurdusBeagle-7B and FelixChao/Severus-7B. The model uses a passthrough merge method to combine the strengths of its two constituent 7B models, offering a versatile foundation for general-purpose text generation and understanding across a broad range of natural language processing tasks.


Sirius-10B: A Merged Language Model

Sirius-10B is a 10.7 billion parameter language model developed by FelixChao. It is constructed through a merge of two distinct 7B parameter models: leveldevai/TurdusBeagle-7B and FelixChao/Severus-7B.

Key Capabilities

  • Model Merging: Built with a passthrough merge, which stacks selected layer ranges from the base models rather than averaging their weights, producing a deeper, potentially more capable combined model.
  • General-Purpose Language Understanding: Designed to handle a broad range of natural language processing tasks, leveraging the combined knowledge of its merged components.
  • Flexible Deployment: Compatible with the standard Hugging Face transformers library for straightforward integration and inference (see the sketch after this list).
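
The model card does not include usage code, so the following is a minimal inference sketch using standard transformers APIs, assuming the repository exposes ordinary causal-LM weights; the prompt and generation parameters are illustrative, not recommendations from the model card.

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "FelixChao/Sirius-10B"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.float16,  # halves memory vs. fp32; pick a dtype your hardware supports
    device_map="auto",          # requires `accelerate`; places layers across available devices
)

prompt = "Explain what a passthrough model merge is, in one paragraph."
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=200, do_sample=True, temperature=0.7)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```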

Configuration Details

The merge configuration specifies that layers 0-24 are sourced from leveldevai/TurdusBeagle-7B, while layers 8-32 are sourced from FelixChao/Severus-7B. If these are half-open ranges (mergekit's layer_range convention), the passthrough merge stacks 24 + 24 = 48 layers, up from 32 in each 7B base, which accounts for the 10.7B parameter count; the overlapping slice means the middle layers appear twice in the final stack. This construction aims to synthesize the strengths of both models.
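
To make the layer arithmetic concrete, here is a small sanity-check sketch. It assumes mergekit's half-open layer_range convention and that the published config exposes num_hidden_layers (standard for Llama/Mistral-family models); neither assumption is confirmed by the model card.

```python
from transformers import AutoConfig

# Layer slices as described in the merge configuration, read as half-open
# ranges (an assumption based on mergekit's layer_range convention).
slices = {
    "leveldevai/TurdusBeagle-7B": range(0, 24),  # layers 0-23
    "FelixChao/Severus-7B": range(8, 32),        # layers 8-31
}
expected = sum(len(r) for r in slices.values())  # 24 + 24 = 48

# Compare against the merged model's published config (needs network access).
cfg = AutoConfig.from_pretrained("FelixChao/Sirius-10B")
print(f"expected {expected} layers; config reports {cfg.num_hidden_layers}")
```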

Good For

  • Developers looking for a merged model built from established 7B architectures.
  • Experimentation with model merging techniques and their impact on performance.
  • General text generation, question answering, and conversational AI applications where a 10B parameter model is suitable.