Yhyu13/LMCocktail-Mistral-7B-v1

TEXT GENERATION · Concurrency Cost: 1 · Model Size: 7B · Quant: FP8 · Ctx Length: 8k · Published: Dec 27, 2023 · License: apache-2.0 · Architecture: Transformer

Yhyu13/LMCocktail-Mistral-7B-v1 is a 7 billion parameter language model based on the Mistral architecture, created by Yhyu13. This model is a 50%-50% merge of Mistral-7B-Instruct-v0.2 and xDAN-L1-Chat-RL-v1, leveraging a novel LM-cocktail merging technique. It demonstrates strong performance in conversational tasks, notably ranking highly on AlpacaEval, making it suitable for general-purpose chat and instruction-following applications.


Model Overview

Yhyu13/LMCocktail-Mistral-7B-v1 is a 7 billion parameter language model developed by Yhyu13. It is constructed using a novel "LM-cocktail" merging technique, combining two high-performing Mistral-based models: Mistral-7B-Instruct-v0.2 and xDAN-L1-Chat-RL-v1, each contributing 50% to the final model. This approach aims to synthesize the strengths of its constituent models.
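The 50%-50% merge amounts to a linear interpolation of the two parent models' parameters. The sketch below illustrates the idea with plain Python dicts of floats standing in for real state dicts; the parameter names are hypothetical, and the actual LM-cocktail scripts in the repository should be consulted for the authoritative procedure.

```python
# Minimal sketch of a 50%-50% linear merge, the core idea behind
# combining two same-architecture models. Real merges operate on
# framework state dicts (tensors); plain floats stand in here.

def merge_models(state_a, state_b, weight_a=0.5):
    """Linearly interpolate two models' parameters, name by name."""
    assert state_a.keys() == state_b.keys(), "architectures must match"
    weight_b = 1.0 - weight_a
    return {
        name: weight_a * state_a[name] + weight_b * state_b[name]
        for name in state_a
    }

# Toy example with two hypothetical parameters per model.
mistral_instruct = {"layer0.weight": 0.2, "layer0.bias": 1.0}
xdan_chat = {"layer0.weight": 0.6, "layer0.bias": -1.0}

merged = merge_models(mistral_instruct, xdan_chat, weight_a=0.5)
print(merged)
```

With `weight_a=0.5` each parent contributes equally, matching the 50%-50% split described above; other ratios would bias the merge toward one parent.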

Key Capabilities & Performance

This model excels in conversational AI and instruction-following tasks. Notably, it achieved a high ranking on AlpacaEval, a benchmark for evaluating instruction-following models, where it was rated by ChatGPT as the second-best model, closely trailing GPT-4. This performance suggests strong capabilities in generating coherent and relevant responses to diverse prompts.

Unique Aspects

The core innovation of this model lies in its LM-cocktail merging technique, a method for combining multiple large language models, described in the paper "LM-Cocktail: Resilient Tuning of Language Models via Model Merging". The merging scripts are available in the model's repository, providing transparency into its construction. The model's 8192-token context length supports moderately long inputs and comprehensive outputs.
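Because one of the parents is Mistral-7B-Instruct-v0.2, the merged model presumably expects the Mistral `[INST]` chat format. The helper below is an assumption for illustration, not taken from the model card; the tokenizer's own chat template should be treated as authoritative.

```python
# Sketch of the Mistral-style [INST] prompt format, which this model
# presumably inherits from Mistral-7B-Instruct-v0.2. Verify against
# the model's tokenizer chat template before relying on it.

def format_mistral_prompt(turns):
    """turns: list of (user, assistant) pairs; the final assistant
    reply may be None when the model is expected to generate it."""
    prompt = "<s>"
    for user, assistant in turns:
        prompt += f"[INST] {user} [/INST]"
        if assistant is not None:
            prompt += f" {assistant}</s>"
    return prompt

prompt = format_mistral_prompt([
    ("What is model merging?", "Combining the weights of several models."),
    ("Give an example.", None),
])
print(prompt)
```

A prompt built this way ends with `[/INST]`, leaving the model to produce the next assistant turn.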

Good For

  • General-purpose chat applications and conversational AI.
  • Instruction-following tasks where high-quality, relevant responses are crucial.
  • Developers interested in model merging techniques and their practical applications.

Popular Sampler Settings

The most popular parameter combinations used by Featherless users for this model adjust the following sampler settings:

  • temperature
  • top_p
  • top_k
  • frequency_penalty
  • presence_penalty
  • repetition_penalty
  • min_p
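To clarify what the first three of these knobs do, here is a minimal, self-contained sketch of temperature scaling plus top-k and top-p (nucleus) filtering over a toy next-token distribution. This is an illustration of the standard sampling algorithms, not the Featherless implementation.

```python
import math
import random

def sample_next_token(logits, temperature=1.0, top_k=0, top_p=1.0, seed=None):
    """Sample a token index from logits after temperature/top-k/top-p filtering."""
    # Temperature: values < 1 sharpen the distribution, values > 1 flatten it.
    scaled = [l / temperature for l in logits]
    # Softmax to probabilities (shifted by the max for numerical stability).
    m = max(scaled)
    exps = [math.exp(l - m) for l in scaled]
    total = sum(exps)
    probs = sorted(
        ((i, e / total) for i, e in enumerate(exps)),
        key=lambda pair: pair[1],
        reverse=True,
    )
    # Top-k: keep only the k most likely tokens (0 disables the filter).
    if top_k > 0:
        probs = probs[:top_k]
    # Top-p (nucleus): keep the smallest prefix whose mass reaches top_p.
    kept, mass = [], 0.0
    for tok, p in probs:
        kept.append((tok, p))
        mass += p
        if mass >= top_p:
            break
    # Renormalize the surviving tokens and draw one.
    z = sum(p for _, p in kept)
    r = random.Random(seed).random() * z
    for tok, p in kept:
        r -= p
        if r <= 0:
            return tok
    return kept[-1][0]

# With top_k=1 only the most likely token survives, so sampling is greedy.
print(sample_next_token([2.0, 1.0, 0.1], top_k=1))  # → 0
```

The penalty parameters (frequency, presence, repetition) and min_p are further filters applied in the same spirit: they rescale or prune token probabilities before the final draw.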