KaraKaraWitch/L3.1-70b-MeowMix

Hugging Face
TEXT GENERATIONConcurrency Cost:4Model Size:70BQuant:FP8Ctx Length:32kPublished:Aug 23, 2024Architecture:Transformer Warm

L3.1-70b-MeowMix by KaraKaraWitch is a 70 billion parameter merged language model based on the Llama 3.1 architecture. This model is a merge of several Llama 3.1-70B variants, including Tess-3, EZO-1.1-it, Chinese-Chat, and Korean-sft-dpo models, created using the ties merge method. While it functions well for English language tasks, it is explicitly noted as a failed attempt for CJK (Chinese, Japanese, Korean) languages, performing poorly in those contexts.

Loading preview...

L3.1-70b-MeowMix: A Merged Llama 3.1-70B Model

L3.1-70b-MeowMix is a 70 billion parameter language model developed by KaraKaraWitch, created through a merge of multiple Llama 3.1-70B base models using the ties merge method via LazyMergekit.

Key Characteristics & Composition

This model integrates components from:

  • migtissera/Tess-3-Llama-3.1-70B
  • HODACHI/Llama-3.1-70B-EZO-1.1-it
  • shenzhi-wang/Llama3.1-70B-Chinese-Chat
  • Saxo/Linkbricks-Horizon-AI-Korean-llama3.1-sft-dpo-70B

The merging process aimed to combine the strengths of these diverse models, with migtissera/Tess-3-Llama-3.1-70B serving as the base model with a higher density and weight in the merge configuration.

Important Limitation: CJK Language Performance

Crucially, L3.1-70b-MeowMix is explicitly identified as a failed attempt for CJK (Chinese, Japanese, Korean) languages. While it performs adequately for English, users should not use this model for CJK-related tasks as its performance in these languages is significantly compromised. This limitation stems from the merging strategy, which did not yield the desired multilingual capabilities for CJK.

Chat Format

The model utilizes the standard Llama 3 Instruct chat format for interactions.

Popular Sampler Settings

Top 3 parameter combinations used by Featherless users for this model. Click a tab to see each config.

temperature
top_p
top_k
frequency_penalty
presence_penalty
repetition_penalty
min_p