flemmingmiguel/MDBX-7B
MDBX-7B is a 7-billion-parameter language model created by flemmingmiguel by merging leveldevai/MarcDareBeagle-7B and leveldevai/MarcBeagle-7B with LazyMergekit. The merge uses the slerp (spherical linear interpolation) method across all 32 layers, with distinct interpolation weights for the self_attn and mlp components. The model targets general language generation tasks and offers a 4096-token context length.
MDBX-7B: A Merged 7B Language Model
MDBX-7B is a 7 billion parameter language model developed by flemmingmiguel. This model is a product of merging two distinct models, leveldevai/MarcDareBeagle-7B and leveldevai/MarcBeagle-7B, utilizing the LazyMergekit framework.
Key Characteristics
- Architecture: A merge of two 7B models, combining their strengths.
- Merge Method: Employs a slerp (spherical linear interpolation) merge, allowing for nuanced blending of model weights.
- Layer-Specific Weighting: The merge configuration applies separate interpolation weights to the self_attn and mlp layers, indicating a fine-tuned approach to combining the source models' capabilities (a configuration sketch follows this list).
- Context Length: Supports a context window of 4096 tokens.
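For context, a LazyMergekit slerp configuration consistent with this description could look like the sketch below, in mergekit's native YAML format. The per-layer interpolation schedules and the choice of base_model are illustrative assumptions (common LazyMergekit defaults), not the model's published values:

```yaml
slices:
  - sources:
      - model: leveldevai/MarcDareBeagle-7B
        layer_range: [0, 32]
      - model: leveldevai/MarcBeagle-7B
        layer_range: [0, 32]
merge_method: slerp
base_model: leveldevai/MarcBeagle-7B   # assumed base; the card does not state which model anchors the merge
parameters:
  t:
    - filter: self_attn
      value: [0, 0.5, 0.3, 0.7, 1]     # illustrative per-layer interpolation curve for attention weights
    - filter: mlp
      value: [1, 0.5, 0.7, 0.3, 0]     # illustrative mirrored curve for MLP weights
    - value: 0.5                       # fallback interpolation weight for all other tensors
dtype: float16
```

Each entry under `t` is an interpolation schedule: mergekit interpolates the listed values across the 32 layers, blending the two source models differently at each depth, which matches the layer-specific weighting described above.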
Usage and Configuration
The model's configuration specifies the slerp interpolation parameters used across its 32 layers, with a fallback value for tensors not covered by the layer-specific filters. Developers can integrate MDBX-7B into their projects with the transformers library: the model is optimized for float16 precision and designed for efficient deployment with device_map="auto". A minimal loading and generation example is sketched below.
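This sketch assumes only the standard transformers AutoModelForCausalLM/AutoTokenizer APIs (plus accelerate installed for device_map="auto"); the prompt and generation settings are placeholders, not values taken from the card:

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "flemmingmiguel/MDBX-7B"

# Load in float16 and let accelerate place the weights automatically.
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.float16,
    device_map="auto",
)

# Placeholder prompt and sampling settings for illustration.
prompt = "What is a merged language model?"
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=128, do_sample=True, temperature=0.7)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```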