Model Overview
EmbeddedLLM/Mistral-7B-Merge-02-v0 is an experimental 7-billion-parameter language model built on the Mistral-7B-v0.1 base. Its primary purpose is to compare the effectiveness of the DARE TIES merging method against SLERP by combining two distinct models: teknium/OpenHermes-2.5-Mistral-7B and Intel/neural-chat-7b-v3-3.
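The model card does not reproduce the exact merge configuration, but a merge like this is typically expressed as a mergekit YAML file. A hedged sketch of what a DARE TIES merge with the stated models and parameters might look like (the `dtype` choice is an assumption, not taken from the card):

```yaml
# Hypothetical mergekit config sketch; the actual config used
# for Mistral-7B-Merge-02-v0 may differ.
models:
  - model: mistralai/Mistral-7B-v0.1
    # Base model: contributes no task vector of its own.
  - model: teknium/OpenHermes-2.5-Mistral-7B
    parameters:
      weight: 0.5
      density: 0.5
  - model: Intel/neural-chat-7b-v3-3
    parameters:
      weight: 0.5
      density: 0.5
merge_method: dare_ties
base_model: mistralai/Mistral-7B-v0.1
dtype: bfloat16  # assumed; not specified in the card
```

Here `density` controls what fraction of each task vector DARE retains, and `weight` scales each model's contribution before the TIES sign election.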
Key Characteristics
- Architecture: Based on Mistral-7B-v0.1.
- Merging Method: Utilizes the DARE TIES method, with weight 0.5 and density 0.5 assigned to each of the two merged models.
- Experimental Focus: Aims to provide a direct comparison of DARE TIES performance against SLERP in model merging.
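To make the merging method concrete, the following toy sketch implements the two steps behind DARE TIES on plain NumPy arrays: DARE randomly drops entries of each task vector (finetuned minus base) and rescales the survivors, then TIES elects a per-parameter majority sign and discards conflicting contributions. This is an illustrative simplification, not the actual mergekit implementation, and the function names are hypothetical:

```python
import numpy as np

def dare(delta, density, rng):
    """DARE step: keep each delta entry with probability `density`,
    rescaling survivors by 1/density so the expected value is preserved."""
    mask = rng.random(delta.shape) < density
    return np.where(mask, delta / density, 0.0)

def dare_ties_merge(base, finetuned, weights, density=0.5, seed=0):
    """Toy DARE TIES merge over flat parameter arrays (hypothetical helper)."""
    rng = np.random.default_rng(seed)
    # 1. Sparsify and rescale each weighted task vector (DARE).
    deltas = [w * dare(ft - base, density, rng)
              for ft, w in zip(finetuned, weights)]
    # 2. TIES sign election: majority sign per parameter.
    elected_sign = np.sign(np.sum(deltas, axis=0))
    # 3. Keep only delta entries that agree with the elected sign, then sum.
    agreed = [np.where(np.sign(d) == elected_sign, d, 0.0) for d in deltas]
    return base + np.sum(agreed, axis=0)

# With density=1.0 (no dropping), two conflicting task vectors cancel
# where their signs disagree and reinforce where they agree:
base = np.zeros(2)
merged = dare_ties_merge(base,
                         [np.array([1.0, 1.0]), np.array([1.0, -1.0])],
                         weights=[0.5, 0.5], density=1.0)
print(merged)  # second coordinate is zeroed by the sign conflict
```

In the real merge, the same weight (0.5) and density (0.5) are applied per model across all transformer parameters rather than to a single flat array.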
Performance Insights
Preliminary results on the Open LLM Leaderboard indicate that this DARE TIES merge scores slightly lower on average than its SLERP-merged counterpart (Weyaxi/OpenHermes-2.5-neural-chat-v3-3-Slerp). However, the model card notes that further tuning of the DARE TIES method might yield improved results. Specific benchmark comparisons include:
- Average: 70.69 (DARE TIES) vs 71.38 (SLERP)
- MMLU: 64.1 (DARE TIES) vs 64.26 (SLERP)
- TruthfulQA: 60.52 (DARE TIES) vs 62.78 (SLERP)
Use Cases
This model is particularly useful for researchers and developers interested in:
- Model Merging Research: Exploring and comparing different model merging techniques like DARE TIES and SLERP.
- Performance Analysis: Evaluating how different merging strategies impact benchmark performance across various tasks.