saishf/Kuno-Lake-7B

Text Generation · Concurrency Cost: 1 · Model Size: 7B · Quant: FP8 · Context Length: 4k · Published: Feb 3, 2024 · License: cc-by-nc-4.0 · Architecture: Transformer

Kuno-Lake-7B by saishf is a 7-billion-parameter language model merged from Mistral-7B-v0.1, WestLake-7B-v2, and Kunoichi-DPO-v2-7B using the DARE TIES method. The merge combines the strengths of its constituent models and scores an average of 73.56 on the Open LLM Leaderboard. With a context length of 4096 tokens, it is suited to general language tasks.


Model Overview

saishf/Kuno-Lake-7B is a 7-billion-parameter language model created by saishf by merging pre-trained models. It uses the DARE TIES merge method, combining mistralai/Mistral-7B-v0.1 as its base with senseable/WestLake-7B-v2 and SanjiWatsuki/Kunoichi-DPO-v2-7B. The merge configuration assigns each donor model its own density and weight parameters and uses bfloat16 as the dtype.
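The actual density and weight values are not reproduced here, but the mechanics of DARE TIES are straightforward: each donor's task vector (its delta from the base) is randomly sparsified and rescaled (DARE), then per-parameter signs are elected and only agreeing deltas are merged back onto the base (TIES). Below is a minimal PyTorch sketch of one common formulation, with hypothetical `density` and `weight` values standing in for the real merge configuration:

```python
import torch

def dare(delta: torch.Tensor, density: float) -> torch.Tensor:
    """DARE: randomly drop (1 - density) of the delta entries, then
    rescale the survivors by 1/density to preserve the expected update."""
    mask = torch.bernoulli(torch.full_like(delta, density))
    return mask * delta / density

def dare_ties_merge(base, donors, densities, weights):
    """Merge donor tensors onto a base tensor with DARE + TIES sign election.

    densities/weights are per-donor merge parameters; the values used for
    Kuno-Lake-7B are not given in this card, so the ones below are
    illustrative only. Weights are applied to the task vectors here, which
    is one common formulation.
    """
    # Task vectors: how far each donor moved away from the base weights.
    deltas = [w * dare(d - base, rho)
              for d, rho, w in zip(donors, densities, weights)]
    stacked = torch.stack(deltas)

    # TIES sign election: per parameter, keep only deltas that agree with
    # the dominant sign of the summed updates.
    elected_sign = torch.sign(stacked.sum(dim=0))
    agree = torch.sign(stacked) == elected_sign
    kept = torch.where(agree, stacked, torch.zeros_like(stacked))

    # Average the surviving deltas and apply them to the base.
    counts = agree.sum(dim=0).clamp(min=1)
    return base + kept.sum(dim=0) / counts

# Toy example on a single tensor; a real merge runs this per weight tensor.
base = torch.randn(4, 4)
west = base + 0.1 * torch.randn(4, 4)   # stand-in for WestLake-7B-v2
kuno = base + 0.1 * torch.randn(4, 4)   # stand-in for Kunoichi-DPO-v2-7B
merged = dare_ties_merge(base, [west, kuno],
                         densities=[0.5, 0.5], weights=[0.5, 0.5])
```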

Performance Highlights

Evaluated on the Open LLM Leaderboard, Kuno-Lake-7B demonstrates competitive performance across various benchmarks:

  • Average Score: 73.56
  • AI2 Reasoning Challenge (25-Shot): 71.84
  • HellaSwag (10-Shot): 88.15
  • MMLU (5-Shot): 64.76
  • TruthfulQA (0-shot): 66.83
  • Winogrande (5-shot): 84.45
  • GSM8k (5-shot): 65.35
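These scores use the Open LLM Leaderboard's harness settings. For a local spot check, something along these lines with EleutherAI's lm-evaluation-harness should approximate one of the runs; note the shot counts above differ per task, so each task needs its own `num_fewshot`. Exact argument names may vary across harness versions:

```python
# Sketch of a local re-run with lm-evaluation-harness (pip install lm-eval).
# Uses the v0.4-style simple_evaluate entry point.
from lm_eval import simple_evaluate

results = simple_evaluate(
    model="hf",
    model_args="pretrained=saishf/Kuno-Lake-7B,dtype=bfloat16",
    tasks=["arc_challenge"],   # AI2 Reasoning Challenge
    num_fewshot=25,            # matches the 25-shot leaderboard setting
)
print(results["results"]["arc_challenge"])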

Key Characteristics

  • Architecture: Transformer (Mistral-7B family); merged from Mistral-7B-v0.1 with WestLake-7B-v2 and Kunoichi-DPO-v2-7B.
  • Merge Method: DARE TIES, which sparsifies and rescales each donor's task vector before sign-elected merging.
  • Parameter Count: 7 billion parameters.
  • Context Length: 4096 tokens.

Use Cases

This model is well-suited for general-purpose language generation and understanding tasks, benefiting from the combined strengths of its merged components. Its balanced performance across reasoning, common sense, and factual recall benchmarks makes it a versatile choice for applications requiring robust language capabilities.
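A minimal snippet for trying the model with the Hugging Face transformers library, keeping generation inside the 4096-token context window (the prompt is illustrative):

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "saishf/Kuno-Lake-7B"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,  # the merge was produced in bfloat16
    device_map="auto",
)

prompt = "Explain the difference between fine-tuning and model merging."
# Truncate the prompt to leave room for new tokens within the 4k context.
inputs = tokenizer(prompt, return_tensors="pt", truncation=True,
                   max_length=4096 - 256).to(model.device)
outputs = model.generate(**inputs, max_new_tokens=256,
                         do_sample=True, temperature=0.7)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```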