monology/mixtral-ties

Text Generation · Concurrency Cost: 1 · Model Size: 7B · Quant: FP8 · Context Length: 4k · Published: Mar 27, 2024 · License: apache-2.0 · Architecture: Transformer · Open Weights

monology/mixtral-ties is an experimental 7 billion parameter language model based on the Mistral-7B-v0.1 architecture, created by monology. This model is a merge of eight different Mixtral-slerp models using the TIES merge method, aiming to combine their capabilities. It is intended for experimental purposes to explore the effects of model merging.


Overview

monology/mixtral-ties is an experimental 7 billion parameter language model developed by monology. It is constructed using the TIES merge method with mistralai/Mistral-7B-v0.1 as its base model. The primary goal of this model is for experimental exploration of model merging techniques.

Merge Details

This model integrates eight distinct monology/mixtral-slerp models, specifically mixtral-slerp0 through mixtral-slerp7. Each merged component was incorporated with a density of 0.5 and a weight of 0.1, as defined in the merge configuration. The merge was performed with mergekit.
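Given the details above, the mergekit configuration would look roughly like the sketch below. The density and weight values and the model names come from the card; other fields (such as `dtype`) are assumptions and may differ from the actual config:

```yaml
merge_method: ties
base_model: mistralai/Mistral-7B-v0.1
models:
  - model: monology/mixtral-slerp0
    parameters:
      density: 0.5
      weight: 0.1
  # ...entries repeat identically for mixtral-slerp1 through mixtral-slerp6...
  - model: monology/mixtral-slerp7
    parameters:
      density: 0.5
      weight: 0.1
dtype: float16  # assumption; not stated in the card
```

Here `density: 0.5` means half of each task vector's parameters (by magnitude) are kept before merging, and `weight: 0.1` scales each model's contribution to the final merge.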

Key Characteristics

  • Base Architecture: Mistral-7B-v0.1
  • Merge Method: TIES (Trim, Elect Sign & Merge)
  • Parameter Count: 7 billion
  • Context Length: 4096 tokens
  • Experimental Nature: Explicitly noted as being for experimental purposes, suggesting its primary value lies in research and development rather than immediate production use.
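The TIES procedure has three steps: trim each task vector (the delta between a fine-tuned model and the base) to its highest-magnitude entries, elect a per-parameter majority sign, then average only the values that agree with that sign before adding the result back to the base. A toy sketch on flat parameter vectors, illustrative only and not the mergekit implementation:

```python
import numpy as np

def ties_merge(base, finetuned, density=0.5, weights=None):
    """Toy TIES merge over flat parameter vectors (illustrative sketch)."""
    # Task vectors: per-model deltas from the base weights
    task_vectors = [ft - base for ft in finetuned]

    # Trim: keep only the top-`density` fraction of each vector by magnitude
    trimmed = []
    for tv in task_vectors:
        k = int(len(tv) * density)
        thresh = np.sort(np.abs(tv))[-k] if k > 0 else np.inf
        trimmed.append(np.where(np.abs(tv) >= thresh, tv, 0.0))

    if weights is None:
        weights = [1.0 / len(trimmed)] * len(trimmed)
    stacked = np.stack([w * t for w, t in zip(weights, trimmed)])

    # Elect sign: majority sign per parameter, weighted by summed magnitude
    elected = np.sign(stacked.sum(axis=0))

    # Merge: average only the nonzero entries agreeing with the elected sign
    agree = (np.sign(stacked) == elected) & (stacked != 0)
    merged = np.where(agree, stacked, 0.0).sum(axis=0) / np.maximum(agree.sum(axis=0), 1)
    return base + merged
```

In this sketch, conflicting signs cancel during sign election, and trimmed-away entries contribute nothing, which is how TIES reduces interference between the merged models.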

Intended Use

This model is best suited for researchers and developers interested in:

  • Exploring the outcomes of the TIES model merging technique.
  • Experimenting with combinations of different fine-tuned models based on the Mistral architecture.
  • Understanding the practical implications and potential performance of merged models.