FredrikBL/FlashbackMist-dare
FredrikBL/FlashbackMist-dare is a 7-billion-parameter language model created by FredrikBL, formed by merging three Mistral-7B-v0.1-based models: timpal0l/Mistral-7B-v0.1-flashback-v2, abacusai/Slerp-CM-mist-dpo, and EmbeddedLLM/Mistral-7B-Merge-14-v0.2. The merge uses the dare_ties method on the Mistral-7B-v0.1 architecture and offers a 4096-token context length. It is designed to combine the strengths of its constituent models for general language generation tasks.
Overview
FredrikBL/FlashbackMist-dare is a 7-billion-parameter language model developed by FredrikBL. It was produced by merging three distinct Mistral-7B-v0.1-based models with the dare_ties merge method via LazyMergekit, using mistralai/Mistral-7B-v0.1 as the base model.
Key Components
This model is a composite of the following specialized Mistral-7B variants:
- timpal0l/Mistral-7B-v0.1-flashback-v2: Contributes to the merge with a density of 0.53 and a weight of 0.4.
- abacusai/Slerp-CM-mist-dpo: Integrated with a density of 0.53 and a weight of 0.3.
- EmbeddedLLM/Mistral-7B-Merge-14-v0.2: Also included with a density of 0.53 and a weight of 0.3.
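Put together, the mergekit (LazyMergekit) recipe for this dare_ties merge plausibly looks like the YAML below. This is a reconstruction from the densities, weights, and options stated on this card, not a verbatim copy of the original configuration file:

```yaml
models:
  - model: timpal0l/Mistral-7B-v0.1-flashback-v2
    parameters:
      density: 0.53
      weight: 0.4
  - model: abacusai/Slerp-CM-mist-dpo
    parameters:
      density: 0.53
      weight: 0.3
  - model: EmbeddedLLM/Mistral-7B-Merge-14-v0.2
    parameters:
      density: 0.53
      weight: 0.3
merge_method: dare_ties
base_model: mistralai/Mistral-7B-v0.1
parameters:
  int8_mask: true
dtype: bfloat16
```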
Configuration and Usage
The merge configuration specifies int8_mask: true and dtype: bfloat16, which keeps the merge memory-efficient while preserving numerical range. Developers can integrate FlashbackMist-dare into their projects with the Hugging Face transformers library; a sketch of a text-generation call is shown below. The model supports a context length of 4096 tokens.
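Since the card refers to a Python snippet for text generation, here is a minimal sketch using the transformers API. The model ID comes from this card; the prompt and generation parameters are illustrative choices, not values from the original snippet:

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "FredrikBL/FlashbackMist-dare"

# Load the tokenizer and model; bfloat16 matches the merge dtype.
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,
    device_map="auto",  # requires the accelerate package
)

# Generate a short completion from a plain-text prompt.
prompt = "Explain what model merging is in one paragraph:"
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(
    **inputs,
    max_new_tokens=128,
    do_sample=True,
    temperature=0.7,
)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```

In bfloat16 the 7B weights take roughly 14 GB of memory, so a single 16 GB+ GPU is comfortable; device_map="auto" lets accelerate split the model across available devices otherwise.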
Good For
- General text generation tasks leveraging the combined capabilities of its merged components.
- Experimentation with models created via advanced merging techniques like dare_ties.
- Applications requiring a 7B-parameter model with a standard context window.