jan-hq/supermario-v1

Text Generation · Model Size: 7B · Quant: FP8 · Context Length: 4k · Published: Dec 11, 2023 · License: apache-2.0 · Architecture: Transformer

jan-hq/supermario-v1 is a 7-billion-parameter language model based on the Mistral-7B-v0.1 architecture, created by Jan with the DARE_TIES merge method. It integrates OpenHermes-2.5, MetaMath-Cybertron-Starling, Magicoder-S-CL-7B, and Marcoroni-7B-v3, aiming to combine their respective strengths. With a 4096-token context length, it targets general language tasks, inheriting its specific optimizations from the merged components.


Model Overview

jan-hq/supermario-v1 was developed by Jan using the DARE_TIES merge method. It is built on the mistralai/Mistral-7B-v0.1 base model and incorporates several other models: Weyaxi/OpenHermes-2.5-neural-chat-v3-3-Slerp, Q-bert/MetaMath-Cybertron-Starling, ise-uiuc/Magicoder-S-CL-7B, and AIDC-ai-business/Marcoroni-7B-v3. The merge aims to combine these models' diverse capabilities in a single checkpoint.

Key Characteristics

  • Architecture: Based on Mistral-7B-v0.1, merged using DARE_TIES.
  • Parameter Count: 7 billion parameters.
  • Context Length: Supports a context window of 4096 tokens.
  • Merged Components: Integrates models known for various strengths, though specific performance gains from each component are not detailed in the README.
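At a high level, DARE (the "Drop And REscale" half of DARE_TIES) sparsifies each source model's task vector (fine-tuned weights minus base weights) by randomly zeroing a fraction of the deltas and rescaling the survivors, so that the expected delta is preserved before the vectors are merged. The following is an illustrative NumPy sketch of that idea only, not the actual mergekit implementation used to build this model; the function name and toy weights are hypothetical.

```python
import numpy as np

def dare(delta, drop_rate, rng):
    """Drop-And-REscale sketch: zero out a random fraction of a task
    vector and rescale the survivors by 1 / (1 - drop_rate) so the
    expected contribution is unchanged."""
    mask = rng.random(delta.shape) >= drop_rate  # keep with prob 1 - drop_rate
    return delta * mask / (1.0 - drop_rate)

rng = np.random.default_rng(0)
base = np.zeros(8)               # toy base-model weights
finetuned = np.ones(8)           # toy fine-tuned weights
delta = finetuned - base         # task vector
sparse_delta = dare(delta, drop_rate=0.5, rng=rng)
merged = base + sparse_delta     # each surviving delta is scaled to 2.0
print(merged)
```

In the real merge, a sparsified task vector is computed per source model and the results are then combined with TIES-style sign election before being added back to the Mistral-7B-v0.1 base.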

Performance Insights

According to the Open LLM Leaderboard evaluations, the model achieves an average score of 29.49. Specific scores include:

  • AI2 Reasoning Challenge (25-shot): 27.73
  • HellaSwag (10-shot): 25.83
  • MMLU (5-shot): 27.04
  • TruthfulQA (0-shot): 47.27
  • Winogrande (5-shot): 49.09
  • GSM8k (5-shot): 0.00

Usage

This model can be run locally using Jan Desktop, an open-source, offline-first ChatGPT alternative for Mac, Windows, and Linux. Jan Desktop also provides an OpenAI-compatible local server.
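Because the local server speaks the OpenAI chat-completions wire format, any OpenAI-style client can talk to it. The sketch below builds such a request with only the standard library; the endpoint URL, port, and model id are assumptions (check your Jan settings), not values stated by this model card.

```python
import json
import urllib.request

# Assumed endpoint for Jan's local server; verify host/port in Jan's settings.
url = "http://localhost:1337/v1/chat/completions"

payload = {
    "model": "supermario-v1",  # model id as registered locally (assumed)
    "messages": [
        {"role": "user", "content": "Summarize DARE_TIES in one sentence."}
    ],
    "max_tokens": 128,
    "temperature": 0.7,
}

req = urllib.request.Request(
    url,
    data=json.dumps(payload).encode("utf-8"),
    headers={"Content-Type": "application/json"},
)

# Uncomment once the Jan local server is running:
# with urllib.request.urlopen(req) as resp:
#     print(json.load(resp)["choices"][0]["message"]["content"])
```

Since the request shape matches OpenAI's API, the official `openai` Python client can also be pointed at the same base URL instead of hand-rolling the HTTP call.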