jan-hq/Mistral-7B-Instruct-v0.2-DARE

Text Generation · Concurrency Cost: 1 · Model Size: 7B · Quant: FP8 · Context Length: 4k · Published: Dec 12, 2023 · License: apache-2.0 · Architecture: Transformer

jan-hq/Mistral-7B-Instruct-v0.2-DARE is a 7 billion parameter language model developed by jan-hq, created by merging Mistral-7B-Instruct-v0.2 with three top-performing models from the OpenLLM Leaderboard using the DARE method. This model is designed to combine the strengths of its constituent models, offering enhanced general instruction following and reasoning capabilities. It supports a 4096-token context length and is optimized for diverse conversational and analytical tasks.
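
The merge relies on DARE's drop-and-rescale idea: each fine-tuned model's delta from the base is randomly sparsified and the surviving entries are rescaled before being added back. The sketch below is a simplified illustration of that idea in PyTorch, not the actual mergekit recipe used to build this model; the tensor shapes and drop rate are illustrative assumptions.

```python
# Simplified illustration of DARE drop-and-rescale on a single weight tensor.
# This is NOT the exact mergekit procedure used for this model; shapes and
# drop_rate below are illustrative assumptions.
import torch

def dare_delta(base: torch.Tensor, finetuned: torch.Tensor, drop_rate: float = 0.9) -> torch.Tensor:
    """Drop a random fraction of the fine-tuned delta, rescale the rest, add back to base."""
    delta = finetuned - base
    keep = (torch.rand_like(delta) >= drop_rate).to(delta.dtype)  # keep ~(1 - drop_rate) of entries
    return base + (delta * keep) / (1.0 - drop_rate)              # rescaling preserves the expected delta

# Toy usage with random tensors standing in for one layer's weights.
base = torch.randn(16, 16)
finetuned = base + 0.01 * torch.randn(16, 16)
merged = dare_delta(base, finetuned, drop_rate=0.9)
```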


jan-hq/Mistral-7B-Instruct-v0.2-DARE Overview

This 7 billion parameter model, developed by jan-hq, is a merge of the base Mistral-7B-Instruct-v0.2 with three high-performing models from the OpenLLM Leaderboard as of December 12th, 2023.

Key Capabilities

  • Enhanced Performance: Uses the DARE (DARE-TIES) merging method to combine the strengths of the base model and the merged-in models, aiming for improved instruction following and general reasoning.
  • Flexible Prompting: Supports both ChatML and Alpaca prompt templates for diverse application needs (see the usage sketch after this list).
  • Offline Operation: Can be run locally using Jan Desktop, ensuring privacy and data confidentiality.
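
As an example of the ChatML template in practice, the following is a minimal sketch using the Hugging Face `transformers` library; the repository id is assumed to be `jan-hq/Mistral-7B-Instruct-v0.2-DARE` and the prompt wording is illustrative.

```python
# Minimal sketch: load the merged model with transformers and prompt it in ChatML format.
# The repo id and prompt text are assumptions based on this model card, not verified values.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "jan-hq/Mistral-7B-Instruct-v0.2-DARE"  # assumed Hugging Face repo id
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, device_map="auto")

# ChatML-style prompt (one of the two templates listed above).
prompt = (
    "<|im_start|>system\nYou are a helpful assistant.<|im_end|>\n"
    "<|im_start|>user\nSummarize the DARE merging method in one sentence.<|im_end|>\n"
    "<|im_start|>assistant\n"
)

inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=128)
print(tokenizer.decode(outputs[0][inputs["input_ids"].shape[1]:], skip_special_tokens=True))
```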

Open LLM Leaderboard Evaluation

This model achieved an Avg. score of 55.84 on the Open LLM Leaderboard, with specific scores including:

  • AI2 Reasoning Challenge (25-shot): 61.95
  • HellaSwag (10-shot): 75.62
  • MMLU (5-shot): 49.99
  • TruthfulQA (0-shot): 54.36
  • Winogrande (5-shot): 74.98
  • GSM8k (5-shot): 18.12

Detailed results are available on the Open LLM Leaderboard.
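
The headline average is simply the arithmetic mean of the six benchmark scores listed above; a quick sanity check in Python:

```python
# Mean of the six Open LLM Leaderboard scores reported above.
scores = [61.95, 75.62, 49.99, 54.36, 74.98, 18.12]
print(round(sum(scores) / len(scores), 2))  # 55.84
```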

Good For

  • Developers seeking a 7B parameter model with combined strengths from multiple top-tier instruction-tuned models.
  • Applications requiring local, confidential AI processing.
  • Experimentation with merged models and diverse prompting styles.