EmbeddedLLM/Mistral-7B-Merge-14-v0.4

TEXT GENERATIONConcurrency Cost:1Model Size:7BQuant:FP8Ctx Length:4kPublished:Jan 3, 2024License:apache-2.0Architecture:Transformer0.0K Open Weights Cold

The EmbeddedLLM/Mistral-7B-Merge-14-v0.4 is a 7 billion parameter language model developed by EmbeddedLLM, created through a multi-stage merging process. It combines 14 models using DARE TIES, followed by a Gradient SLERP merge with Weyaxi/OpenHermes-2.5-neural-chat-v3-3-openchat-3.5-1210-Slerp. This experimental model achieves an average score of 71.19 on the Open LLM Leaderboard, demonstrating solid performance across various benchmarks including ARC, HellaSwag, MMLU, TruthfulQA, Winogrande, and GSM8K, and is suitable for general language tasks.

Loading preview...

Model Overview

EmbeddedLLM/Mistral-7B-Merge-14-v0.4 is a 7 billion parameter experimental language model developed by EmbeddedLLM. It is the result of a sophisticated, multi-stage merging process designed to combine the strengths of multiple base models. Initially, 14 different models were merged using the DARE TIES method to create an intermediate version. This intermediate model was then further merged with Weyaxi/OpenHermes-2.5-neural-chat-v3-3-openchat-3.5-1210-Slerp using Gradient SLERP.

Key Capabilities & Performance

This merged model demonstrates strong general performance, achieving an average score of 71.19 on the Open LLM Leaderboard. Specific benchmark results include:

  • ARC: 66.81
  • HellaSwag: 86.15
  • MMLU: 65.10
  • TruthfulQA: 58.25
  • Winogrande: 80.03
  • GSM8K: 70.81

Usage & Characteristics

The model supports both ChatML and Llama-2 chat templates, offering flexibility for integration into various applications. While it performs well across a range of tasks, the developers note that it "may require further instruction fine-tuning" for optimal performance in specific instruction-following scenarios. The merging configuration details, including the use of slerp for specific tensor filters, are publicly available, providing transparency into its construction.