TareksGraveyard/Thespian-LLaMa-70B

  • Visibility: Public
  • Parameters: 70B
  • Quantization: FP8
  • Context length: 32768
  • License: llama3.3
Overview

TareksGraveyard/Thespian-LLaMa-70B is a 70 billion parameter language model developed by TareksGraveyard. It was produced with experimental merging techniques, specifically the della merge method via mergekit. The model's foundation is nbeerbower/Llama-3.1-Nemotron-lorablated-70B, onto which several other Llama-based models were merged.

Merge Details

This model was created by merging five distinct Llama-based models, each contributing equally with a 0.20 weight:

  • SicariusSicariiStuff/Negative_LLAMA_70B
  • Sao10K/70B-L3.3-Cirrus-x1
  • TheDrummer/Anubis-70B-v1
  • Sao10K/L3.3-70B-Euryale-v2.3
  • EVA-UNIT-01/EVA-LLaMA-3.33-70B-v0.1

The merge configuration also specified parameters such as density: 0.7, epsilon: 0.2, and lambda: 1.1, with bfloat16 as the data type. The primary goal of this merge was to explore combinations of existing high-performing Llama models, aiming to synthesize their individual strengths into a new, versatile model. While the creator notes that another experimental merge, Progenitor-V1.1-LLaMa-70B, might offer different characteristics, Thespian-LLaMa-70B represents a specific blend of these influential Llama variants.
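Based on the details above, the mergekit configuration likely resembled the following sketch. The exact file is not reproduced in this card, so field placement (e.g. whether density is set per-model or globally) is an assumption consistent with mergekit's della method:

```yaml
# Hypothetical reconstruction of the merge config described above.
merge_method: della
base_model: nbeerbower/Llama-3.1-Nemotron-lorablated-70B
models:
  - model: SicariusSicariiStuff/Negative_LLAMA_70B
    parameters:
      weight: 0.20
      density: 0.7
  - model: Sao10K/70B-L3.3-Cirrus-x1
    parameters:
      weight: 0.20
      density: 0.7
  - model: TheDrummer/Anubis-70B-v1
    parameters:
      weight: 0.20
      density: 0.7
  - model: Sao10K/L3.3-70B-Euryale-v2.3
    parameters:
      weight: 0.20
      density: 0.7
  - model: EVA-UNIT-01/EVA-LLaMA-3.33-70B-v0.1
    parameters:
      weight: 0.20
      density: 0.7
parameters:
  epsilon: 0.2
  lambda: 1.1
dtype: bfloat16
```

In della-style merges, density controls how many delta parameters are retained from each model, epsilon sets the magnitude-based pruning window, and lambda scales the merged deltas before they are applied to the base model.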