jan-hq/supermario-slerp-v2

Text Generation · Concurrency Cost: 1 · Model Size: 7B · Quant: FP8 · Context Length: 4k · Published: Dec 12, 2023 · License: apache-2.0 · Architecture: Transformer

jan-hq/supermario-slerp-v2 is a 7 billion parameter language model created by Jan, using the SLERP merge method to combine v1olet_marcoroni-go-bruins-merge-7B and juanako-7b-UNA. The model is a test project for exploring model merging techniques. It achieves an average score of 71.35 on the Open LLM Leaderboard, demonstrating capabilities across various reasoning and language understanding tasks within a 4096 token context window.


Model Overview

jan-hq/supermario-slerp-v2 was developed by Jan as a test project for exploring model merging. It was built with the SLERP (spherical linear interpolation) merge method, which combines two models by interpolating their weights along the arc between them rather than along the straight line. The two inputs are v1olet_marcoroni-go-bruins-merge-7B, which also serves as the base model for the merge, and juanako-7b-UNA.
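For intuition, a minimal per-tensor SLERP sketch is below. It assumes PyTorch tensors and an interpolation factor of t = 0.5, and illustrates the general technique rather than the exact mergekit implementation; the tensor and state-dict names are hypothetical.

```python
import torch

def slerp(t: float, a: torch.Tensor, b: torch.Tensor, eps: float = 1e-8) -> torch.Tensor:
    """Spherical linear interpolation between two weight tensors.

    Computes the angle between the flattened tensors and interpolates
    along the great-circle arc; falls back to linear interpolation when
    the vectors are nearly parallel.
    """
    a_flat, b_flat = a.flatten().float(), b.flatten().float()
    a_unit = a_flat / (a_flat.norm() + eps)
    b_unit = b_flat / (b_flat.norm() + eps)
    dot = torch.clamp(a_unit @ b_unit, -1.0, 1.0)
    theta = torch.arccos(dot)  # angle between the two weight vectors
    if theta.abs() < 1e-4:     # nearly parallel: slerp degenerates to lerp
        merged = (1 - t) * a_flat + t * b_flat
    else:
        merged = (torch.sin((1 - t) * theta) * a_flat
                  + torch.sin(t * theta) * b_flat) / torch.sin(theta)
    return merged.reshape(a.shape).to(a.dtype)

# Hypothetical usage: merge each parameter of two state dicts at t = 0.5.
# merged_sd = {k: slerp(0.5, base_sd[k], other_sd[k]) for k in base_sd}
```

Compared with plain weight averaging, interpolating along the arc better preserves the norm of the merged weights, which is the usual motivation for choosing SLERP over linear merges.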

Key Capabilities & Performance

This model demonstrates solid performance across a range of benchmarks, as evaluated on the Open LLM Leaderboard. Its reported average of 71.35 is the unweighted mean of the six task scores below (see the quick check after this list):

  • AI2 Reasoning Challenge (25-Shot): 69.37
  • HellaSwag (10-Shot): 86.60
  • MMLU (5-Shot): 64.91
  • TruthfulQA (0-Shot): 62.96
  • Winogrande (5-Shot): 80.82
  • GSM8k (5-Shot): 63.46
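As a quick sanity check, the leaderboard average is just the arithmetic mean of the six benchmark scores:

```python
scores = {
    "ARC (25-shot)": 69.37,
    "HellaSwag (10-shot)": 86.60,
    "MMLU (5-shot)": 64.91,
    "TruthfulQA (0-shot)": 62.96,
    "Winogrande (5-shot)": 80.82,
    "GSM8k (5-shot)": 63.46,
}
average = sum(scores.values()) / len(scores)
print(f"{average:.2f}")  # 71.35, matching the reported leaderboard average
```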

Usage and Development

This model can be run locally using Jan Desktop, an open-source, offline-first ChatGPT alternative. Jan Desktop keeps conversations and model data on your machine, stores them in an open file format, and exposes OpenAI-compatible API endpoints. The model's development acknowledges the mergekit, DARE, SLERP, and lm-evaluation-harness projects.
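Because Jan exposes an OpenAI-compatible API when its local server is enabled, the model can be queried with any OpenAI client. A minimal sketch follows; the base URL, port, and model identifier are assumptions that depend on your local Jan configuration.

```python
from openai import OpenAI

# Point the client at Jan's local OpenAI-compatible server.
# The port (1337) and model id are assumptions; check your Jan settings.
client = OpenAI(base_url="http://localhost:1337/v1", api_key="not-needed-locally")

response = client.chat.completions.create(
    model="supermario-slerp-v2",
    messages=[{"role": "user", "content": "Summarize what a SLERP model merge does."}],
    max_tokens=256,
)
print(response.choices[0].message.content)
```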