KaraKaraWitch/L3.1-70b-MeowMix

Warm
Public
70B
FP8
32768
Aug 23, 2024
Hugging Face
Overview

L3.1-70b-MeowMix: A Merged Llama 3.1-70B Model

L3.1-70b-MeowMix is a 70 billion parameter language model developed by KaraKaraWitch, created through a merge of multiple Llama 3.1-70B base models using the ties merge method via LazyMergekit.

Key Characteristics & Composition

This model integrates components from:

  • migtissera/Tess-3-Llama-3.1-70B
  • HODACHI/Llama-3.1-70B-EZO-1.1-it
  • shenzhi-wang/Llama3.1-70B-Chinese-Chat
  • Saxo/Linkbricks-Horizon-AI-Korean-llama3.1-sft-dpo-70B

The merging process aimed to combine the strengths of these diverse models, with migtissera/Tess-3-Llama-3.1-70B serving as the base model with a higher density and weight in the merge configuration.

Important Limitation: CJK Language Performance

Crucially, L3.1-70b-MeowMix is explicitly identified as a failed attempt for CJK (Chinese, Japanese, Korean) languages. While it performs adequately for English, users should not use this model for CJK-related tasks as its performance in these languages is significantly compromised. This limitation stems from the merging strategy, which did not yield the desired multilingual capabilities for CJK.

Chat Format

The model utilizes the standard Llama 3 Instruct chat format for interactions.