jan-hq/supermario-slerp-v3
jan-hq/supermario-slerp-v3 is a 7-billion-parameter language model created by Jan HQ using the Slerp (spherical linear interpolation) merge method. It combines the strengths of supermario-slerp-v2 and supermario-v2, focusing on general language understanding and generation. The model achieves an average score of 72.22 on the Open LLM Leaderboard, demonstrating solid performance across a range of reasoning and comprehension tasks.
Model Overview
jan-hq/supermario-slerp-v3 is a 7-billion-parameter language model developed by Jan HQ. It is the product of a model-merging experiment that uses the Slerp method to combine two prior models, supermario-slerp-v2 and supermario-v2, with the aim of carrying the strengths of both parents into a single, more robust and versatile model.
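Slerp merging interpolates between corresponding weight tensors of the two parent models along a great circle on a hypersphere rather than along a straight line. Below is a minimal sketch of the operation on a single pair of tensors, assuming the merge works tensor-by-tensor; the function name and the linear-interpolation fallback are illustrative assumptions, not the exact implementation used to produce this model.

```python
import numpy as np

def slerp(t, v0, v1, eps=1e-8):
    """Spherical linear interpolation between two weight tensors.

    t is the interpolation factor in [0, 1]: 0 returns v0, 1 returns v1.
    """
    v0_flat, v1_flat = v0.ravel(), v1.ravel()
    # Cosine of the angle between the two tensors, treated as vectors
    dot = np.dot(v0_flat, v1_flat) / (
        np.linalg.norm(v0_flat) * np.linalg.norm(v1_flat) + eps
    )
    theta = np.arccos(np.clip(dot, -1.0, 1.0))
    # Nearly colinear tensors: fall back to plain linear interpolation
    if abs(np.sin(theta)) < eps:
        return (1.0 - t) * v0 + t * v1
    # Standard slerp formula, weighting each endpoint by sin terms
    return (np.sin((1.0 - t) * theta) * v0 + np.sin(t * theta) * v1) / np.sin(theta)
```

Compared with plain linear averaging, interpolating along the sphere keeps the magnitude and angular relationships of the parent weights closer to intact, which is the usual motivation for choosing Slerp in merge experiments.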
Key Capabilities & Performance
This model demonstrates strong general-purpose language understanding and generation. Its performance has been evaluated on the Open LLM Leaderboard, where it achieved an average score of 72.22. Specific benchmark results include:
- AI2 Reasoning Challenge (25-shot): 69.28
- HellaSwag (10-shot): 86.71
- MMLU (5-shot): 65.11
- TruthfulQA (0-shot): 61.77
- Winogrande (5-shot): 80.51
- GSM8k (5-shot): 69.98
Unique Aspects
- Slerp Merge Method: This model is a direct result of Jan HQ's exploration of advanced model-merging techniques. Slerp interpolates between the parents' weights along the surface of a hypersphere rather than a straight line (see the sketch above), combining two existing strong models without retraining.
- Open-Source Ecosystem Focus: Jan HQ is committed to building infrastructure and tooling for the open-source AI ecosystem, with this model serving as part of their ongoing research and development efforts.
Usage
This model can be run using Jan Desktop, an open-source, offline-first ChatGPT alternative available for Mac, Windows, and Linux. Jan Desktop ensures conversational privacy and offers OpenAI-compatible endpoints for local server interaction.
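Since the local server speaks the OpenAI API, any OpenAI-compatible client can query it. The sketch below uses the openai Python package; the base URL, port, API key handling, and model identifier are assumptions to be checked against your Jan Desktop server settings.

```python
from openai import OpenAI

client = OpenAI(
    base_url="http://localhost:1337/v1",  # assumed default Jan local server address
    api_key="not-needed",                 # local servers typically ignore the key
)

response = client.chat.completions.create(
    model="supermario-slerp-v3",  # assumed model id as listed in Jan Desktop
    messages=[
        {"role": "user", "content": "Explain what a Slerp model merge does."}
    ],
)
print(response.choices[0].message.content)
```

Because everything runs against the local endpoint, prompts and completions never leave your machine, in line with Jan Desktop's offline-first design.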