saishf/Top-Western-Maid-7B

Text generation · Model size: 7B · Quant: FP8 · Context length: 4k · Published: Feb 4, 2024 · License: cc-by-nc-4.0 · Architecture: Transformer · Open weights

saishf/Top-Western-Maid-7B is a 7 billion parameter language model created by saishf, merged using the DARE TIES method with Mistral-7B-v0.1 as its base. It integrates capabilities from Noromaid-7B-0.4-DPO, Toppy-M-7B, and WestLake-7B-v2, aiming for balanced performance across reasoning and language understanding tasks. With an average score of 71.57 on the Open LLM Leaderboard, it is suitable for general-purpose applications requiring robust language generation and comprehension.


Model Overview

saishf/Top-Western-Maid-7B is a 7 billion parameter language model developed by saishf, built upon the mistralai/Mistral-7B-v0.1 base model. It was created using the DARE TIES merge method, combining the strengths of three distinct models: NeverSleep/Noromaid-7B-0.4-DPO, Undi95/Toppy-M-7B, and senseable/WestLake-7B-v2.

Key Capabilities & Performance

This merged model demonstrates strong performance across a range of benchmarks, achieving an average score of 71.57 on the Open LLM Leaderboard. Notable scores include:

  • AI2 Reasoning Challenge (25-Shot): 69.37
  • HellaSwag (10-Shot): 87.40
  • MMLU (5-Shot): 64.63
  • Winogrande (5-Shot): 83.27
  • GSM8k (5-Shot): 65.96

Merge Details

The DARE TIES merge configuration assigned each contributing model its own density and weight parameters, applied an int8_mask, and used the bfloat16 dtype. This approach aims to leverage the distinct characteristics of the constituent models to create a versatile and capable language model.
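For reference, DARE TIES merges of this kind are commonly expressed as a mergekit configuration. The sketch below shows the general shape of such a config for the models named above; the actual density and weight values used for this merge are not listed here, so the numbers shown are placeholders, not the real parameters.

```yaml
# Illustrative mergekit config sketch -- density/weight values are placeholders.
models:
  - model: NeverSleep/Noromaid-7B-0.4-DPO
    parameters:
      density: 0.5   # placeholder
      weight: 0.3    # placeholder
  - model: Undi95/Toppy-M-7B
    parameters:
      density: 0.5   # placeholder
      weight: 0.3    # placeholder
  - model: senseable/WestLake-7B-v2
    parameters:
      density: 0.5   # placeholder
      weight: 0.4    # placeholder
merge_method: dare_ties
base_model: mistralai/Mistral-7B-v0.1
parameters:
  int8_mask: true
dtype: bfloat16
```

In this format, `density` controls what fraction of each model's delta weights survive DARE's random pruning, and `weight` scales each model's contribution when the retained deltas are combined onto the base model.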

Ideal Use Cases

Given its balanced performance across reasoning, common sense, and language understanding tasks, Top-Western-Maid-7B is well-suited for general-purpose applications where a 7B parameter model with a 4096-token context length is appropriate. It can be used for tasks requiring robust text generation, question answering, and logical inference.
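A minimal usage sketch with the Hugging Face `transformers` library is shown below. It assumes `transformers` and `accelerate` are installed and that enough VRAM is available for a 7B model; the `fits_context` helper and the `RUN_DEMO` gate are illustrative additions, not part of the model card.

```python
import os


def fits_context(prompt_tokens: int, new_tokens: int, ctx_len: int = 4096) -> bool:
    """The model has a 4096-token context window; check a request fits
    (prompt plus requested new tokens) before calling generate()."""
    return prompt_tokens + new_tokens <= ctx_len


# Heavy model loading is gated behind an env var so the helper above
# can be imported and exercised without downloading the weights.
if __name__ == "__main__" and os.environ.get("RUN_DEMO"):
    import torch
    from transformers import AutoModelForCausalLM, AutoTokenizer

    model_id = "saishf/Top-Western-Maid-7B"
    tokenizer = AutoTokenizer.from_pretrained(model_id)
    model = AutoModelForCausalLM.from_pretrained(
        model_id, torch_dtype=torch.bfloat16, device_map="auto"
    )

    prompt = "Explain why the sky is blue in one paragraph."
    inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
    if fits_context(inputs["input_ids"].shape[1], 256):
        out = model.generate(
            **inputs, max_new_tokens=256, do_sample=True, temperature=0.7
        )
        print(tokenizer.decode(out[0], skip_special_tokens=True))
```

Because the published quantization is FP8 with a 4k context, keeping prompt length plus `max_new_tokens` under 4096 (as the helper checks) avoids truncation at inference time.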