EmbeddedLLM/Mistral-7B-Merge-14-v0.3
EmbeddedLLM/Mistral-7B-Merge-14-v0.3 is a 7 billion parameter language model based on the Mistral architecture, developed by EmbeddedLLM. This model is an experimental merge of 14 different Mistral-7B variants using the DARE TIES method, designed to combine their strengths. It offers a 4096-token context length and aims to provide a robust base model for various applications, particularly those benefiting from a blend of capabilities from its constituent models.
Loading preview...
Model Overview
EmbeddedLLM/Mistral-7B-Merge-14-v0.3 is a 7 billion parameter model built upon the Mistral architecture, created by EmbeddedLLM. This model represents an experimental merge of 14 distinct Mistral-7B-based models using the DARE TIES method. The merging process specifically excluded models with potential TruthfulQA contamination or non-commercial licenses, ensuring a commercially viable and refined base model.
Key Characteristics
- Merged Architecture: Combines the strengths of 14 different Mistral-7B variants, including models like
mistralai/Mistral-7B-Instruct-v0.2,ehartford/dolphin-2.2.1-mistral-7b,SciPhi/SciPhi-Mistral-7B-32k,meta-math/MetaMath-Mistral-7B, andHuggingFaceH4/zephyr-7b-beta. - Performance: Achieves an average score of 69.66 on the Open LLM Leaderboard, with specific scores including 65.96 on ARC, 85.29 on HellaSwag, 64.35 on MMLU, 57.80 on TruthfulQA, 78.30 on Winogrande, and 66.26 on GSM8K.
- Context Length: Supports a context window of 4096 tokens.
Use Cases
This model is intended as a strong base model that may benefit from further chat fine-tuning. Its merged nature suggests a broad range of capabilities, making it suitable for general-purpose language tasks where a blend of different model strengths is advantageous. Users can experiment with chat templates like ChatML or Llama-2 for optimal interaction.