Tarek07/Primogenitor-V2.1-LLaMa-70B

Text generation · Concurrency cost: 4 · Model size: 70B · Quant: FP8 · Context length: 32K · License: llama3.3 · Architecture: Transformer

Primogenitor-V2.1-LLaMa-70B is a 70-billion-parameter language model created by Tarek07, merged using the Linear DELLA method with nbeerbower/Llama-3.1-Nemotron-lorablated-70B as its base. The model integrates components from six distinct LLaMa-based models, including Sao10K/L3.1-70B-Hanami-x1 and LatitudeGames/Wayfarer-Large-70B-Llama-3.3, to enhance its overall capabilities. With a 32,768-token context length, it is designed for general-purpose language tasks, drawing on the strengths of its diverse merged constituents.


Primogenitor V2.1: A Merged LLaMa-70B Model

Primogenitor-V2.1-LLaMa-70B is a 70-billion-parameter language model developed by Tarek07, using the Linear DELLA merge method to combine the strengths of multiple pre-trained models. The merge takes nbeerbower/Llama-3.1-Nemotron-lorablated-70B as its foundational base and integrates diverse LLaMa-based components into a single robust, versatile model.

Key Capabilities

  • Enhanced Performance: By merging six distinct LLaMa-based models, Primogenitor V2.1 aims to consolidate their individual strengths, potentially leading to improved performance across various language understanding and generation tasks.
  • Diverse Model Integration: The merge includes models such as Sao10K/L3.1-70B-Hanami-x1, Sao10K/70B-L3.3-Cirrus-x1, LatitudeGames/Wayfarer-Large-70B-Llama-3.3, SicariusSicariiStuff/Negative_LLAMA_70B, TheDrummer/Anubis-70B-v1, and EVA-UNIT-01/EVA-LLaMA-3.33-70B-v0.1.
  • Optimized Merging: The Linear DELLA method is applied with explicit epsilon, lambda, and normalize parameters, indicating a deliberately tuned approach to combining model weights (see the config sketch after this list).
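
For readers who want to reproduce a merge of this shape, the sketch below shows what a mergekit configuration for a `della_linear` merge over these six models could look like. All numeric values (weights, densities, epsilon, lambda) are illustrative placeholders, not the parameters actually used for Primogenitor-V2.1.

```python
import subprocess
import textwrap

# Hypothetical mergekit config for a Linear DELLA merge in the shape described
# above. The weight, density, epsilon, and lambda values are placeholders.
CONFIG = textwrap.dedent("""\
    merge_method: della_linear
    base_model: nbeerbower/Llama-3.1-Nemotron-lorablated-70B
    models:
      - model: Sao10K/L3.1-70B-Hanami-x1
        parameters:
          weight: 0.16
          density: 0.7
      - model: Sao10K/70B-L3.3-Cirrus-x1
        parameters:
          weight: 0.16
          density: 0.7
      - model: LatitudeGames/Wayfarer-Large-70B-Llama-3.3
        parameters:
          weight: 0.17
          density: 0.7
      - model: SicariusSicariiStuff/Negative_LLAMA_70B
        parameters:
          weight: 0.17
          density: 0.7
      - model: TheDrummer/Anubis-70B-v1
        parameters:
          weight: 0.17
          density: 0.7
      - model: EVA-UNIT-01/EVA-LLaMA-3.33-70B-v0.1
        parameters:
          weight: 0.17
          density: 0.7
    parameters:
      epsilon: 0.2
      lambda: 1.1
      normalize: false
    dtype: bfloat16
    """)


def main() -> None:
    # Write the config to disk, then invoke mergekit's CLI entry point.
    with open("primogenitor-style-merge.yml", "w") as f:
        f.write(CONFIG)
    # --cuda runs the merge on GPU; merging 70B weights requires substantial
    # disk space and memory regardless.
    subprocess.run(
        ["mergekit-yaml", "primogenitor-style-merge.yml", "./merged-model", "--cuda"],
        check=True,
    )


if __name__ == "__main__":
    main()
```

In DELLA, deltas from each constituent are dropped stochastically with magnitude-dependent probabilities: epsilon sets how far those drop probabilities vary around the base rate implied by the density, and lambda rescales the surviving deltas, so small changes to either can noticeably shift how much each constituent contributes.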

Good For

  • General-purpose language tasks: Its broad base from multiple LLaMa models makes it suitable for a wide array of applications.
  • Researchers and developers: Those interested in exploring the outcomes of advanced model merging techniques, particularly with the DELLA method, will find this model valuable.

Popular Sampler Settings

Featherless surfaces the three sampler-parameter combinations most commonly used with this model. The tunable settings are:

temperature, top_p, top_k, frequency_penalty, presence_penalty, repetition_penalty, min_p
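
The specific top-3 values are not reproduced here, but as a rough sketch, this is how such sampler settings could be passed to the model through Featherless's OpenAI-compatible API. The numeric values below are illustrative, not the user-sourced configs, and whether the endpoint honors every extra_body field is an assumption:

```python
from openai import OpenAI

# Featherless exposes an OpenAI-compatible endpoint; substitute a real API key.
client = OpenAI(
    base_url="https://api.featherless.ai/v1",
    api_key="YOUR_FEATHERLESS_API_KEY",  # placeholder
)

response = client.chat.completions.create(
    model="Tarek07/Primogenitor-V2.1-LLaMa-70B",
    messages=[{"role": "user", "content": "Write a short scene set on a night train."}],
    # Standard OpenAI sampler parameters (values are illustrative):
    temperature=1.0,
    top_p=0.95,
    frequency_penalty=0.0,
    presence_penalty=0.0,
    # top_k, min_p, and repetition_penalty are not part of the OpenAI schema;
    # OpenAI-compatible servers commonly accept them via extra_body.
    extra_body={"top_k": 40, "min_p": 0.05, "repetition_penalty": 1.05},
)
print(response.choices[0].message.content)
```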