TareksGraveyard/Inception-LLaMa-70B

70B parameters · FP8 · 32768-token context · License: llama3.3

Model Overview

TareksGraveyard/Inception-LLaMa-70B is a 70-billion-parameter language model developed by TareksGraveyard. It is an experimental merge of multiple pre-trained Llama-based models, produced with the SCE (Select, Calculate, Erase) merge method and using nbeerbower/Llama-3.1-Nemotron-lorablated-70B as the base model.

Merge Composition

The Inception-LLaMa-70B model integrates components from five distinct Llama-based models.

This merging strategy aims to consolidate the capabilities and knowledge encoded in the individual models into a single model. The merge was configured with a top-k selection value (select_topk) of 1.0 and was processed in the bfloat16 dtype.
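As a rough illustration, the sketch below shows how a merge with this configuration could be expressed with the mergekit library, which implements the SCE method. The five constituent model identifiers are placeholders, since this overview does not name the actual source models, and the library calls follow mergekit's documented Python usage; only the base model, merge method, select_topk value, and dtype come from this card.

```python
# Hypothetical reconstruction of the merge setup; the entries under "models"
# are placeholders, not the actual five constituent models.
from mergekit.config import MergeConfiguration
from mergekit.merge import MergeOptions, run_merge

config_dict = {
    "merge_method": "sce",
    "base_model": "nbeerbower/Llama-3.1-Nemotron-lorablated-70B",
    "models": [
        {"model": "example-org/llama-finetune-1"},  # placeholder
        {"model": "example-org/llama-finetune-2"},  # placeholder
        {"model": "example-org/llama-finetune-3"},  # placeholder
        {"model": "example-org/llama-finetune-4"},  # placeholder
        {"model": "example-org/llama-finetune-5"},  # placeholder
    ],
    # select_topk = 1.0 keeps all task-vector elements in SCE's selection step.
    "parameters": {"select_topk": 1.0},
    "dtype": "bfloat16",
}

merge_config = MergeConfiguration.model_validate(config_dict)
run_merge(
    merge_config,
    out_path="./Inception-LLaMa-70B",
    options=MergeOptions(cuda=True, copy_tokenizer=True),
)
```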

Intended Use

Given its experimental nature as a "merge of a merge of a merge," this model is best suited to users who want to explore the aggregated behavior of its constituent models. It is intended for general-purpose language tasks, drawing on the combined strengths of its Llama-based origins.
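
Usage Example

The overview does not include inference instructions, so the following is a minimal sketch assuming the repository follows standard Llama-3-style transformers conventions (a chat template bundled with the tokenizer). The prompt is illustrative and the dtype and device settings are assumptions, not documented requirements.

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "TareksGraveyard/Inception-LLaMa-70B"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,  # the merge itself was produced in bfloat16
    device_map="auto",           # shard the 70B weights across available GPUs
)

messages = [
    {"role": "user", "content": "Explain model merging in two sentences."},
]
input_ids = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

output = model.generate(input_ids, max_new_tokens=128)
print(tokenizer.decode(output[0][input_ids.shape[-1]:], skip_special_tokens=True))
```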