jsfs11/MoEv4Config-TestWeightedTIES-7b
jsfs11/MoEv4Config-TestWeightedTIES-7b is a 7-billion-parameter language model created by jsfs11 by merging Kukedlc/NeuTrixOmniBe-7B-model-remix, PetroGPT/WestSeverus-7B-DPO, and vanillaOVO/supermario_v4 with the TIES merging method. The merge assigns each constituent model its own density and weight parameters and applies int8 masking and weight normalization. The model achieves an average score of 75.39 on the Open LLM Leaderboard, demonstrating balanced performance across reasoning and language-understanding tasks.
MoEv4Config-TestWeightedTIES-7b Overview
This 7-billion-parameter model, developed by jsfs11, was produced with the TIES (TrIm, Elect Sign & Merge) merging method. It combines three base models: Kukedlc/NeuTrixOmniBe-7B-model-remix, PetroGPT/WestSeverus-7B-DPO, and vanillaOVO/supermario_v4. The merge assigns each component model its own density and weight configuration and applies int8 masking and weight normalization; a hypothetical configuration illustrating this setup is sketched below.
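For readers unfamiliar with how such merges are specified, the sketch below shows what a TIES merge of these three models could look like using mergekit (the common tool for this kind of merge). The density and weight values, the choice of base model, and the dtype are illustrative assumptions, not the configuration actually used for this model:

```python
# Hypothetical reconstruction of a TIES merge config in mergekit's YAML format.
# All density/weight values and the base_model are assumptions for illustration;
# the actual configuration for this model may differ. Requires `pip install mergekit`.
import pathlib
import subprocess

config = """\
models:
  - model: Kukedlc/NeuTrixOmniBe-7B-model-remix
    parameters:
      density: 0.5   # fraction of delta weights kept after trimming (assumed)
      weight: 0.4    # this model's contribution to the merge (assumed)
  - model: PetroGPT/WestSeverus-7B-DPO
    parameters:
      density: 0.5
      weight: 0.3
  - model: vanillaOVO/supermario_v4
    parameters:
      density: 0.5
      weight: 0.3
merge_method: ties
base_model: mistralai/Mistral-7B-v0.1   # assumed base; all three are Mistral-7B derivatives
parameters:
  normalize: true    # renormalize merged weights
  int8_mask: true    # compute sign-election masks in int8 to save memory
dtype: bfloat16
"""

pathlib.Path("ties_config.yml").write_text(config)
# mergekit's CLI entry point; writes the merged model to ./merged
subprocess.run(["mergekit-yaml", "ties_config.yml", "merged", "--cuda"], check=True)
```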
Key Capabilities & Performance
MoEv4Config-TestWeightedTIES-7b demonstrates solid performance across a range of benchmarks, achieving an average score of 75.39 on the Open LLM Leaderboard. Its specific benchmark results include:
- AI2 Reasoning Challenge (25-shot): 71.59
- HellaSwag (10-shot): 88.19
- MMLU (5-shot): 65.07
- TruthfulQA (0-shot): 70.87
- Winogrande (5-shot): 83.82
- GSM8k (5-shot): 72.78
These scores indicate a balanced capability in reasoning, common sense, language understanding, and mathematical problem-solving.
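As a rough way to check one of these numbers locally, EleutherAI's lm-evaluation-harness (the framework behind the Open LLM Leaderboard) exposes a simple_evaluate API. The sketch below mirrors the 25-shot ARC entry above; exact scores depend on harness version and prompt setup, so treat it as an approximation rather than a leaderboard reproduction:

```python
# Sketch: scoring one leaderboard task locally with EleutherAI's
# lm-evaluation-harness (pip install lm-eval). Results may not match the
# Open LLM Leaderboard exactly, since harness versions and prompts differ.
import lm_eval

results = lm_eval.simple_evaluate(
    model="hf",
    model_args="pretrained=jsfs11/MoEv4Config-TestWeightedTIES-7b,dtype=bfloat16",
    tasks=["arc_challenge"],  # AI2 Reasoning Challenge
    num_fewshot=25,           # matches the 25-shot leaderboard setting
    batch_size=8,
)
print(results["results"]["arc_challenge"])
```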
Good for
- Applications requiring a general-purpose 7B model with balanced performance across various tasks.
- Use cases benefiting from a model built by weight-space merging (TIES) of several fine-tuned checkpoints rather than a single fine-tune.
- Developers looking for a model with demonstrated capabilities in reasoning, factual recall, and language comprehension, as indicated by its leaderboard scores.
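The model can be loaded like any Hugging Face causal language model. A minimal inference sketch follows; it assumes the checkpoint works as a standard Mistral-style causal LM, and the prompt and generation settings are illustrative:

```python
# Minimal inference sketch using Hugging Face transformers.
# Assumes the checkpoint loads as a standard causal LM; settings are illustrative.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "jsfs11/MoEv4Config-TestWeightedTIES-7b"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,  # halves memory vs. fp32; needs a recent GPU
    device_map="auto",           # spreads layers across available devices
)

prompt = "Explain the TIES model-merging method in one paragraph."
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=200, do_sample=False)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```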