KaraKaraWitch/L3.1-70b-Milasha

Hugging Face
TEXT GENERATIONConcurrency Cost:4Model Size:70BQuant:FP8Ctx Length:32kPublished:Aug 23, 2024License:cc-by-nc-4.0Architecture:Transformer Open Weights Warm

KaraKaraWitch/L3.1-70b-Milasha is a 70 billion parameter language model developed by KaraKaraWitch, representing the third and final series based on Llama 3.1. This model is a merge of several Llama 3.1-70B models, including Fizzarolli/L3.1-70b-glitz-v0.2 and KaraKaraWitch/L3.1-70b-MeowMix2, with MeowMix specifically providing language reinforcement for CJK. It is designed as a general-purpose model, combining strengths from its constituent merges.

Loading preview...

Overview

KaraKaraWitch/L3.1-70b-Milasha is the third and final model series from KaraKaraWitch built upon the Llama 3.1 architecture. This 70 billion parameter model is a strategic merge of five distinct Llama 3.1-70B models, created using LazyMergekit.

Key Merged Components

The model integrates several notable Llama 3.1-70B variants:

  • Fizzarolli/L3.1-70b-glitz-v0.2
  • KaraKaraWitch/L3.1-70b-MeowMix2
  • sophosympatheia/New-Dawn-Llama-3.1-70B-v1.1
  • nothingiisreal/L3.1-70B-Celeste-V0.1-BF16
  • Sao10K/L3.1-70B-Hanami-x1

Differentiators

Milasha's primary distinction lies in its merging strategy, aiming to combine the strengths of its constituent models. Notably, KaraKaraWitch/L3.1-70b-MeowMix2 was specifically included for language reinforcement in CJK (Chinese, Japanese, Korean), suggesting enhanced performance in these languages compared to models without such specialized components. The model adheres to the Llama 3 Instruct chat format.

Licensing

Following the preferences of Sao10k, one of the merged model's creators, L3.1-70b-Milasha is licensed under CC-BY-NC-4.0 (Creative Commons Attribution-NonCommercial 4.0 International).

Popular Sampler Settings

Top 3 parameter combinations used by Featherless users for this model. Click a tab to see each config.

temperature
top_p
top_k
frequency_penalty
presence_penalty
repetition_penalty
min_p