Undi95/Miqu-MS-70B

Text generation · Model size: 69B · Quant: FP8 · Context length: 32k · Concurrency cost: 4 · Published: Mar 30, 2024 · License: cc-by-nc-4.0 · Architecture: Transformer · Open weights

Undi95/Miqu-MS-70B is a 69-billion-parameter language model created by Undi95 using the Model Stock merge method. It combines several pre-trained language models, including MiquMaid-v2-70B, Midnight-Miqu-70B-v1.0, and Tess-70B-v1.6, with 152334H/miqu-1-70b-sf as its base. The merge is designed to pool the strengths of its constituent models, offering broad applicability across text generation tasks with a 32,768-token context length.


Miqu-MS-70B Overview

Miqu-MS-70B is a 69-billion-parameter language model developed by Undi95 using the Model Stock merge method. This approach combines the capabilities of multiple pre-trained models, with 152334H/miqu-1-70b-sf serving as the foundational base.

Key Merge Details

The model integrates several prominent 70B models to enhance its overall performance and address potential gaps in the base model. The merged components include:

  • migtissera/Tess-70B-v1.6
  • NeverSleep/MiquMaid-v2-70B
  • sophosympatheia/Midnight-Miqu-70B-v1.0

This strategic merge aims to create a robust and versatile model for a wide range of language understanding and generation tasks.
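The card does not reproduce the exact merge configuration, but a Model Stock merge of the models listed above could plausibly be expressed as a mergekit config along these lines (a hedged sketch; field values such as `dtype` are assumptions, not taken from the card):

```yaml
# Hypothetical mergekit config sketch for a Model Stock merge.
# The component models and base are from the card; everything else is assumed.
models:
  - model: migtissera/Tess-70B-v1.6
  - model: NeverSleep/MiquMaid-v2-70B
  - model: sophosympatheia/Midnight-Miqu-70B-v1.0
base_model: 152334H/miqu-1-70b-sf
merge_method: model_stock
dtype: float16
```

Model Stock computes merged weights by interpolating the fine-tuned checkpoints toward the base model, which is why the base is listed separately from the merged components.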

Prompt Formats

To ensure broad compatibility and ease of use, Miqu-MS-70B supports multiple common prompt formats, including:

  • Alpaca: `### Instruction:\n{system prompt}\n\n### Input:\n{prompt}\n\n### Response:\n{output}`
  • Mistral: `[INST] {prompt} [/INST]`
  • Vicuna: `SYSTEM: <ANY SYSTEM CONTEXT>\nUSER: {prompt}\nASSISTANT:`

This flexibility allows developers to integrate the model into existing workflows with minimal adjustments, and its 32,768-token context window accommodates long prompts.
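The three templates above can be assembled with a small helper before text is sent to an inference endpoint. The function below is a hypothetical illustration (not part of the model release); the format names are just keys chosen here:

```python
# Hypothetical helper showing how Miqu-MS-70B's supported prompt
# formats could be assembled. Only the template strings come from
# the card; the function and its names are illustrative.

def build_prompt(fmt: str, prompt: str, system: str = "") -> str:
    """Wrap `prompt` (and optional `system` context) in one of the
    model's supported prompt formats."""
    if fmt == "alpaca":
        return (f"### Instruction:\n{system}\n\n"
                f"### Input:\n{prompt}\n\n### Response:\n")
    if fmt == "mistral":
        return f"[INST] {prompt} [/INST]"
    if fmt == "vicuna":
        return f"SYSTEM: {system}\nUSER: {prompt}\nASSISTANT:"
    raise ValueError(f"unknown format: {fmt}")

print(build_prompt("mistral", "Summarize this article."))
```

Swapping the `fmt` argument is all that is needed to test which template yields the best completions for a given workload.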