FredrikBL/FlashbackMist-dare

Text generation · Model size: 7B · Quant: FP8 · Concurrency cost: 1 · Context length: 4k · Published: Mar 28, 2024 · License: apache-2.0 · Architecture: Transformer

FredrikBL/FlashbackMist-dare is a 7-billion-parameter language model created by FredrikBL by merging three Mistral-7B-v0.1-based models: timpal0l/Mistral-7B-v0.1-flashback-v2, abacusai/Slerp-CM-mist-dpo, and EmbeddedLLM/Mistral-7B-Merge-14-v0.2. The merge uses the dare_ties method, is built on the Mistral-7B-v0.1 architecture, and offers a 4096-token context length. It is designed to combine the strengths of its constituent models for general language-generation tasks.

Overview

FredrikBL/FlashbackMist-dare is a 7-billion-parameter language model developed by FredrikBL. It was produced by merging three distinct Mistral-7B-v0.1-based models with the dare_ties merge method via LazyMergekit, using mistralai/Mistral-7B-v0.1 as the base model.

Key Components

This model is a composite of the following specialized Mistral-7B variants:

  • timpal0l/Mistral-7B-v0.1-flashback-v2: merged with a density of 0.53 and a weight of 0.4.
  • abacusai/Slerp-CM-mist-dpo: merged with a density of 0.53 and a weight of 0.3.
  • EmbeddedLLM/Mistral-7B-Merge-14-v0.2: merged with a density of 0.53 and a weight of 0.3; these values appear in the configuration sketch below.
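
Based on the densities, weights, and merge settings listed on this card, the LazyMergekit/mergekit configuration would look roughly like the sketch below. This is a reconstruction rather than the author's original config file, and the output path and file names are placeholders.

```python
# Hypothetical reconstruction of the dare_ties merge configuration, built only
# from the values stated on this card (densities, weights, int8_mask, dtype).
import yaml  # PyYAML

merge_config = {
    "models": [
        {"model": "timpal0l/Mistral-7B-v0.1-flashback-v2",
         "parameters": {"density": 0.53, "weight": 0.4}},
        {"model": "abacusai/Slerp-CM-mist-dpo",
         "parameters": {"density": 0.53, "weight": 0.3}},
        {"model": "EmbeddedLLM/Mistral-7B-Merge-14-v0.2",
         "parameters": {"density": 0.53, "weight": 0.3}},
    ],
    "merge_method": "dare_ties",
    "base_model": "mistralai/Mistral-7B-v0.1",
    "parameters": {"int8_mask": True},
    "dtype": "bfloat16",
}

# mergekit consumes this configuration as YAML, e.g. via its
# `mergekit-yaml config.yaml <output_dir>` command-line entry point.
with open("config.yaml", "w") as f:
    yaml.safe_dump(merge_config, f, sort_keys=False)
```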

Configuration and Usage

The merge configuration specifies int8_mask: true and dtype: bfloat16, which keep memory use down during merging and store the merged weights in bfloat16 precision. Developers can integrate FlashbackMist-dare into their projects with the Hugging Face transformers library; a sketch of typical text-generation usage follows. The model supports a context length of 4096 tokens.
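
A minimal text-generation sketch with transformers is shown below. It assumes the weights and tokenizer are available on the Hugging Face Hub under FredrikBL/FlashbackMist-dare; the prompt, sampling settings, and device placement are illustrative and should be adapted to your hardware.

```python
# Minimal text-generation sketch using Hugging Face transformers.
# The prompt and generation settings below are illustrative, not prescribed by the model card.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "FredrikBL/FlashbackMist-dare"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,  # matches the merge's bfloat16 dtype
    device_map="auto",           # place layers on available GPU(s)/CPU
)

prompt = "Explain in one paragraph what the DARE-TIES model-merging method does."
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)

# Keep prompt plus generated tokens well within the 4096-token context window.
outputs = model.generate(
    **inputs,
    max_new_tokens=200,
    do_sample=True,
    temperature=0.7,
)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```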

Good For

  • General text generation tasks leveraging the combined capabilities of its merged components.
  • Experimentation with models created via advanced merging techniques like dare_ties.
  • Applications requiring a 7B parameter model with a standard context window.