TareksGraveyard/Inception-LLaMa-70B

Text generation
Concurrency Cost: 4
Model Size: 70B
Quant: FP8
Ctx Length: 32k
Tool Calling: Supported
Published: Jan 29, 2025
License: llama3.3
Architecture: Transformer

TareksGraveyard/Inception-LLaMa-70B is a 70 billion parameter language model created by TareksGraveyard by merging multiple LLaMa-based models. This experimental merge uses the SCE method with nbeerbower/Llama-3.1-Nemotron-lorablated-70B as its base. Its source models are themselves merges, and the goal is to combine their capabilities in a single general-purpose model.


Inception-LLaMa-70B: An Experimental Merge Model

Inception-LLaMa-70B is a 70 billion parameter language model developed by TareksGraveyard as an experiment in combining the strengths of several pre-trained LLaMa-based models. The model is a "merge of a merge of a merge": its source models are themselves products of earlier merges.

Merge Details

This model was created with the mergekit tool, using the SCE (Select, Calculate, Erase) merge method. The base model for the merge was nbeerbower/Llama-3.1-Nemotron-lorablated-70B.
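To give a rough idea of what an SCE-style merge does, here is a minimal pure-Python sketch on flat lists of numbers. It is a toy under stated assumptions and not mergekit's implementation: the function name sce_merge, the list-based layout, and the exact weighting are illustrative choices, and a real merge operates on full model tensors. The three steps are: select the highest-variance positions among the differences from the base, calculate a weight per model from the squared size of its selected differences, and erase entries whose sign disagrees with the majority.

```python
from statistics import pvariance


def sce_merge(base, models, select_topk=1.0):
    """Toy SCE-style merge of one flat parameter vector (illustrative only)."""
    n = len(base)
    # Task vectors: each model's difference from the base.
    deltas = [[m[i] - base[i] for i in range(n)] for m in models]

    # Select: keep the top-k fraction of positions, ranked by variance across models.
    variances = [pvariance([d[i] for d in deltas]) for i in range(n)]
    keep_count = max(1, min(n, round(select_topk * n)))
    ranked = sorted(range(n), key=lambda i: variances[i], reverse=True)
    keep = set(ranked[:keep_count])
    deltas = [[d[i] if i in keep else 0.0 for i in range(n)] for d in deltas]

    # Calculate: one weight per model, proportional to its sum of squared deltas.
    energy = [sum(x * x for x in d) for d in deltas]
    total = sum(energy)
    weights = [e / total if total else 1.0 / len(models) for e in energy]

    # Erase: at each position, drop deltas whose sign opposes the majority sign,
    # then take the weighted average of the deltas that remain.
    merged = []
    for i in range(n):
        majority = sum(d[i] for d in deltas)
        acc = 0.0
        weight_sum = 0.0
        for w, d in zip(weights, deltas):
            if d[i] * majority > 0:
                acc += w * d[i]
                weight_sum += w
        merged.append(base[i] + (acc / weight_sum if weight_sum else 0.0))
    return merged


if __name__ == "__main__":
    base = [0.0, 0.0]
    models = [[2.0, 1.0], [2.0, -3.0]]
    print(sce_merge(base, models, select_topk=1.0))
```

With select_topk set to 1.0 in this sketch, the selection step keeps every position, so only the weighting and sign-erasure steps change the result.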

Models Included in the Merge

Inception-LLaMa-70B integrates components from a diverse set of existing merged models.

This layered merging strategy aims to consolidate the varied capabilities and characteristics of its constituent models into a single, potentially more versatile, large language model. The merge was configured with select_topk: 1.0 and a dtype of bfloat16.
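The settings above can be gathered into a configuration structure. The sketch below builds one as a Python dictionary whose keys follow the general layout of a mergekit YAML config (models, merge_method, base_model, parameters, dtype). The helper name build_sce_config is hypothetical, and the two source model names are placeholders, since the actual list of merged models is not reproduced here.

```python
import json


def build_sce_config(base_model, source_models, select_topk=1.0, dtype="bfloat16"):
    """Assemble a mergekit-style config dict for an SCE merge (illustrative helper)."""
    return {
        "models": [{"model": name} for name in source_models],
        "merge_method": "sce",
        "base_model": base_model,
        "parameters": {"select_topk": select_topk},
        "dtype": dtype,
    }


config = build_sce_config(
    base_model="nbeerbower/Llama-3.1-Nemotron-lorablated-70B",
    # Hypothetical placeholders: the real source models are not listed in this text.
    source_models=["example-org/merged-model-a", "example-org/merged-model-b"],
)

if __name__ == "__main__":
    print(json.dumps(config, indent=2))
```

Only the base model, the sce method, select_topk: 1.0, and the bfloat16 dtype come from the description above; everything else in the sketch is an assumption made for illustration.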