MRAIRR/mini_7B_dare_v1

Text generation · Model size: 7B · Quantization: FP8 · Context length: 4K · Published: Jan 30, 2024 · License: apache-2.0 · Architecture: Transformer · Open weights

MRAIRR/mini_7B_dare_v1 is a 7-billion-parameter language model merged by MRAIRR, based on mistralai/Mistral-7B-v0.1 with a 4096-token context length. It was created with the DARE TIES merge method, which combines OpenBuddy/openbuddy-mistral-7b-v13.1, MRAIRR/hubsalmon_tra, and EmbeddedLLM/Mistral-7B-Merge-14-v0.3 into a single set of weights, with the aim of preserving the strengths of each constituent model.


MRAIRR/mini_7B_dare_v1: A DARE TIES Merged Mistral-7B Model

MRAIRR/mini_7B_dare_v1 is a 7-billion-parameter language model built on the mistralai/Mistral-7B-v0.1 base. It is distinguished by how it was created: rather than further fine-tuning, its weights were produced with the DARE TIES merge technique via mergekit.

Merge Details

This model integrates the capabilities of three distinct pre-trained models:

  • OpenBuddy/openbuddy-mistral-7b-v13.1
  • MRAIRR/hubsalmon_tra
  • EmbeddedLLM/Mistral-7B-Merge-14-v0.3

The DARE TIES method combines two published ideas: DARE (Yu et al., 2023) randomly drops a fraction of each model's fine-tuned weight deltas and rescales the remainder, while TIES (Yadav et al., 2023) resolves sign conflicts among the surviving deltas before they are summed. The goal is to preserve the individual strengths of the merged components while reducing interference between them. The configuration specifies a density of 0.53 (roughly 53% of each model's deltas retained) and a weight of 0.4 for each contributing model, with int8_mask enabled and a bfloat16 dtype for efficiency.
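
The exact merge configuration file was not published with this card, but the parameters above let us sketch a plausible reconstruction. The snippet below writes the inferred YAML and invokes mergekit's standard mergekit-yaml entry point; the field layout follows mergekit's documented dare_ties schema, and the config filename and output directory are hypothetical:

```python
# Sketch: reproducing the merge with mergekit.
# The YAML is reconstructed from the parameters the card reports
# (density 0.53 and weight 0.4 per model, int8_mask, bfloat16);
# the actual config file was not published.
import pathlib
import subprocess

CONFIG = """\
merge_method: dare_ties
base_model: mistralai/Mistral-7B-v0.1
models:
  - model: OpenBuddy/openbuddy-mistral-7b-v13.1
    parameters:
      density: 0.53   # keep ~53% of this model's delta weights
      weight: 0.4     # this model's share in the weighted sum
  - model: MRAIRR/hubsalmon_tra
    parameters:
      density: 0.53
      weight: 0.4
  - model: EmbeddedLLM/Mistral-7B-Merge-14-v0.3
    parameters:
      density: 0.53
      weight: 0.4
parameters:
  int8_mask: true     # build merge masks in int8 to save memory
dtype: bfloat16       # carry out the merge in bfloat16
"""

pathlib.Path("dare_ties.yml").write_text(CONFIG)
# mergekit-yaml <config> <output-dir> is mergekit's standard CLI entry point;
# it writes the merged weights and tokenizer files to the output directory.
subprocess.run(["mergekit-yaml", "dare_ties.yml", "./mini_7B_dare_v1"], check=True)
```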

Potential Use Cases

Given its Mistral-7B foundation and the merge strategy described above, mini_7B_dare_v1 is likely suited to the following (a minimal loading sketch follows the list):

  • General text generation and understanding based on the Mistral architecture.
  • Tasks benefiting from the combined knowledge of its diverse base models.
  • Research into advanced model-merging techniques, or targeted performance tuning.
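
As a merged Mistral-7B derivative, the model loads through the standard transformers API. Below is a minimal sketch; the prompt and generation settings are illustrative, not recommendations from the card:

```python
# Minimal loading sketch using the standard transformers API.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "MRAIRR/mini_7B_dare_v1"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,  # matches the dtype used for the merge
    device_map="auto",
)

# Illustrative prompt and sampling settings, not tuned values from the card.
inputs = tokenizer("The DARE TIES merge method works by", return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=128, do_sample=True, temperature=0.7)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```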