paulml/OmniBeagleSquaredMBX-v3-7B-v2

Text generation · Concurrency cost: 1 · Model size: 7B · Quantization: FP8 · Context length: 4k · Published: Feb 9, 2024 · License: cc-by-nc-4.0 · Architecture: Transformer · Open weights

OmniBeagleSquaredMBX-v3-7B-v2 by paulml is a 7-billion-parameter language model created by merging paulml/OmniBeagleMBX-v3-7B and flemmingmiguel/MBX-7B-v3 with LazyMergekit. The merge uses spherical linear interpolation (slerp) with separate interpolation weights for the self_attn and mlp layers, aiming to combine the strengths of its constituent models. It is designed for general text generation tasks and offers a 4096-token context window.


OmniBeagleSquaredMBX-v3-7B-v2 Overview

OmniBeagleSquaredMBX-v3-7B-v2 is a 7-billion-parameter language model developed by paulml. It was produced with LazyMergekit by merging two base models: paulml/OmniBeagleMBX-v3-7B and flemmingmiguel/MBX-7B-v3.

Key Capabilities

  • Merged Architecture: Built with a slerp (spherical linear interpolation) merge that applies separate interpolation weights to the self_attn and mlp layers across the full 32-layer range of both base models, aiming for a balanced model that integrates the strengths of its components (see the sketch after this list).
  • General Text Generation: Suited to a wide range of text generation tasks, producing coherent, contextually relevant output at the 7B-parameter scale.
  • Standard Context Window: Supports a context length of 4096 tokens, allowing for processing and generating moderately long sequences of text.

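The slerp step at the heart of this merge is straightforward to write down. Below is a minimal NumPy sketch of spherical linear interpolation between two flattened weight tensors, with the usual fallback to linear interpolation for near-colinear inputs. The interpolation factors shown are illustrative only, not the actual values from this model's merge configuration.

```python
import numpy as np

def slerp(t: float, v0: np.ndarray, v1: np.ndarray, eps: float = 1e-8) -> np.ndarray:
    """Spherically interpolate between flattened weight tensors v0 and v1."""
    # The angle is measured between the normalized tensors.
    u0 = v0 / (np.linalg.norm(v0) + eps)
    u1 = v1 / (np.linalg.norm(v1) + eps)
    dot = np.clip(np.dot(u0, u1), -1.0, 1.0)
    if abs(dot) > 1.0 - eps:
        # Nearly colinear tensors: plain linear interpolation is numerically safer.
        return (1.0 - t) * v0 + t * v1
    theta = np.arccos(dot)
    s = np.sin(theta)
    return (np.sin((1.0 - t) * theta) / s) * v0 + (np.sin(t * theta) / s) * v1

# Illustrative only: attention and MLP weights blended with different factors,
# mirroring the per-filter weighting this merge applies to self_attn and mlp.
attn_merged = slerp(0.3, np.random.randn(4096), np.random.randn(4096))
mlp_merged = slerp(0.7, np.random.randn(4096), np.random.randn(4096))
```

In practice, merge tooling such as mergekit applies a schedule of t values per layer and per filter (self_attn vs. mlp), which is how the two components of this model are weighted differently.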
Good for

  • Experimenting with merged models and understanding the effects of different merge configurations.
  • Applications requiring a 7B-parameter model for general-purpose text generation (a loading sketch follows this list).
  • Developers looking for a model with a 4096-token context for various NLP tasks.
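
Since the weights are open, the model can presumably be loaded through the standard Hugging Face transformers API. A minimal sketch, assuming the Hub repository id matches the model name; the dtype and generation settings here are illustrative, not taken from the card:

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "paulml/OmniBeagleSquaredMBX-v3-7B-v2"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.float16,  # assumption: fp16 for local use; the hosted variant is quantized to FP8
    device_map="auto",
)

prompt = "Explain spherical linear interpolation in one paragraph."
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=200, do_sample=True, temperature=0.7)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```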