flemmingmiguel/MBX-7B-v2

Text Generation · Concurrency Cost: 1 · Model Size: 7B · Quant: FP8 · Ctx Length: 4k · Published: Jan 29, 2024 · License: apache-2.0 · Architecture: Transformer

flemmingmiguel/MBX-7B-v2 is a 7-billion-parameter language model created by flemmingmiguel, built as a slerp merge of flemmingmiguel/MBX-7B and flemmingmiguel/MBX-7B-v2. The merge configuration interpolates between the two parents differently across the attention and MLP stacks, with the aim of combining their strengths. The model is intended for general text generation within a 4096-token context window.


Overview

MBX-7B-v2 is a 7-billion-parameter language model developed by flemmingmiguel. It was produced with LazyMergekit, which performed a slerp merge of two models: flemmingmiguel/MBX-7B and flemmingmiguel/MBX-7B-v2.

Merge Configuration

The merge applied different interpolation factors (t values) to different components of the model architecture; a sketch of the corresponding configuration appears after this list:

  • Self-attention layers (self_attn): t values of [0, 0.5, 0.3, 0.7, 1], applied as a schedule across layer depth.
  • MLP layers (mlp): the mirrored schedule [1, 0.5, 0.7, 0.3, 0].
  • Fallback: t = 0.45 for all tensors not matched by the filters above.
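Based on the values above, the LazyMergekit configuration plausibly looked like the following sketch. It follows the standard mergekit slerp schema, but the layer_range and base_model entries are assumptions not stated by this card:

```yaml
slices:
  - sources:
      - model: flemmingmiguel/MBX-7B      # first parent
        layer_range: [0, 32]              # assumed: full 32-layer 7B stack
      - model: flemmingmiguel/MBX-7B-v2   # second parent, as named above
        layer_range: [0, 32]
merge_method: slerp
base_model: flemmingmiguel/MBX-7B         # assumption; the card does not name the base
parameters:
  t:
    - filter: self_attn
      value: [0, 0.5, 0.3, 0.7, 1]        # schedule across layer depth
    - filter: mlp
      value: [1, 0.5, 0.7, 0.3, 0]        # mirrored schedule
    - value: 0.45                         # fallback for all other tensors
dtype: float16
```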

Because the self_attn and mlp schedules are mirror images, layers whose attention weights come mostly from one parent take their MLP weights mostly from the other, rather than blending every tensor uniformly. The merged weights are stored in float16, and the model supports a context length of 4096 tokens.
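To make the role of t concrete, here is a minimal, self-contained sketch of spherical linear interpolation between two weight tensors. It is illustrative only and not mergekit's exact implementation, which handles normalization and edge cases in its own way:

```python
import torch

def slerp(t: float, v0: torch.Tensor, v1: torch.Tensor, eps: float = 1e-8) -> torch.Tensor:
    """Spherically interpolate between two weight tensors; t=0 -> v0, t=1 -> v1."""
    a, b = v0.flatten().float(), v1.flatten().float()
    # Angle between the two weight vectors, from their normalized dot product.
    dot = torch.clamp(torch.dot(a / (a.norm() + eps), b / (b.norm() + eps)), -1.0, 1.0)
    theta = torch.arccos(dot)
    if theta.abs().item() < 1e-4:
        # Nearly colinear weights: fall back to plain linear interpolation.
        mixed = (1 - t) * a + t * b
    else:
        sin_theta = torch.sin(theta)
        mixed = (torch.sin((1 - t) * theta) / sin_theta) * a \
              + (torch.sin(t * theta) / sin_theta) * b
    return mixed.reshape(v0.shape).to(v0.dtype)

# Example: blend two hypothetical attention weight matrices with t = 0.3.
w_a, w_b = torch.randn(64, 64), torch.randn(64, 64)
merged = slerp(0.3, w_a, w_b)
```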

Usage

Developers can integrate MBX-7B-v2 into their projects using the Hugging Face transformers library for text generation.
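The Python snippet referenced by the original card is not reproduced above; the following is a minimal sketch using the standard transformers pipeline API, where the chat-style prompt and sampling parameters are illustrative assumptions:

```python
# Requires: pip install transformers accelerate torch

import torch
import transformers
from transformers import AutoTokenizer

model_id = "flemmingmiguel/MBX-7B-v2"

# Build a chat-style prompt; using the chat template is an assumption
# about the model's expected input format.
tokenizer = AutoTokenizer.from_pretrained(model_id)
messages = [{"role": "user", "content": "Explain what a slerp model merge is."}]
prompt = tokenizer.apply_chat_template(
    messages, tokenize=False, add_generation_prompt=True
)

# Load in float16 to match the dtype the model was merged in;
# device_map="auto" places weights on available devices via accelerate.
pipe = transformers.pipeline(
    "text-generation",
    model=model_id,
    torch_dtype=torch.float16,
    device_map="auto",
)

outputs = pipe(
    prompt,
    max_new_tokens=256,
    do_sample=True,
    temperature=0.7,
    top_k=50,
    top_p=0.95,
)
print(outputs[0]["generated_text"])
```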