Name: Ashapu/anarva-8b-merged API
Brand: Featherless.ai
Price: 10.00 USD
Availability: InStock
Author: Ashapu

Model Overview

Ashapu/anarva-8b-merged is an 8-billion-parameter language model built on the Llama 3.1 architecture. It was created by Ashapu with the mergekit tool, using the DARE TIES merge method. This method combines the weights of several fine-tuned models into a single model that is intended to retain capabilities from each of them.

Merge Details

The model integrates four models derived from Llama 3.1-8B, with NousResearch/Hermes-3-Llama-3.1-8B serving as the base of the merge. The other three merged components are:

cognitivecomputations/dolphin-2.9.4-llama3.1-8b
arcee-ai/Llama-3.1-SuperNova-Lite
deepseek-ai/DeepSeek-R1-Distill-Llama-8B

The DARE TIES configuration assigns each contributing model its own density and weight parameters. The tokenizer configuration also incorporates special tokens from dolphin-2.9.4-llama3.1-8b and DeepSeek-R1-Distill-Llama-8B, so that the chat-format and reasoning markers used by those models are present in the merged vocabulary.
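To make the roles of density and weight concrete, here is a minimal sketch of the DARE TIES idea on toy parameter lists. It is a simplified illustration and not mergekit's implementation or the configuration used for this model: the function names, the toy values, and the absence of weight normalization are all assumptions made for the example.

```python
import random


def dare_prune(delta, density, rng):
    """DARE step: keep each entry with probability `density` and
    rescale the survivors by 1/density; dropped entries become 0."""
    if not 0.0 < density <= 1.0:
        raise ValueError("density must be in (0, 1]")
    return [d / density if rng.random() < density else 0.0 for d in delta]


def dare_ties_merge(base, finetuned, densities, weights, seed=0):
    """Toy DARE TIES merge over flat lists of floats (hypothetical helper).

    base      -- parameters of the base model
    finetuned -- list of parameter lists, one per contributing model
    densities -- fraction of each model's delta that is kept
    weights   -- scale applied to each model's surviving delta
    """
    rng = random.Random(seed)
    # Task vectors: what each model changed relative to the base.
    deltas = [[f - b for f, b in zip(ft, base)] for ft in finetuned]
    # Sparsify each task vector, then apply that model's weight.
    scaled = [
        [w * d for d in dare_prune(delta, density, rng)]
        for delta, density, w in zip(deltas, densities, weights)
    ]
    merged = []
    for i, b in enumerate(base):
        column = [s[i] for s in scaled]
        # TIES step: elect a sign from the summed contributions and
        # keep only the contributions that agree with it.
        sign = 1.0 if sum(column) >= 0 else -1.0
        agreeing = [c for c in column if c * sign > 0]
        merged.append(b + sum(agreeing))
    return merged


if __name__ == "__main__":
    base = [0.0, 0.0]
    models = [[1.0, 1.0], [1.0, -3.0]]
    print(dare_ties_merge(base, models, densities=[1.0, 1.0], weights=[0.5, 0.5]))
```

With a density of 1.0 nothing is dropped, so the second parameter shows the sign election in isolation: the two models disagree, the negative direction wins, and only the agreeing contribution is added to the base.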

Key Characteristics

Architecture: Llama 3.1-based, 8 billion parameters.
Merge Method: DARE TIES, combining four high-quality Llama 3.1-8B models.
Context Length: Supports a context window of 32768 tokens.
Tokenizer: Features a union tokenizer with specific tokens from merged models, including <|im_start|>, <|im_end|>, and <think>.
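The special tokens listed above hint at how prompts and outputs may be structured. The sketch below assumes a ChatML-style layout built from <|im_start|> and <|im_end|>, and assumes that reasoning is closed with a matching </think> tag; neither the exact chat template nor the closing tag is stated on this page, and both helper names are hypothetical.

```python
def format_chatml(messages, add_generation_prompt=True):
    """Render a list of {"role", "content"} dicts in a ChatML-style layout.

    Assumption: each turn is wrapped as
    <|im_start|>ROLE\nCONTENT<|im_end|>\n
    """
    parts = [
        f"<|im_start|>{m['role']}\n{m['content']}<|im_end|>\n" for m in messages
    ]
    if add_generation_prompt:
        # Leave an open assistant turn for the model to complete.
        parts.append("<|im_start|>assistant\n")
    return "".join(parts)


def split_reasoning(text):
    """Separate a <think>...</think> span from the rest of the output.

    Returns (reasoning, answer). If no complete span is found, the
    reasoning is empty and the whole text is treated as the answer.
    """
    open_tag, close_tag = "<think>", "</think>"
    start = text.find(open_tag)
    end = text.find(close_tag)
    if start == -1 or end == -1 or end < start:
        return "", text.strip()
    reasoning = text[start + len(open_tag):end].strip()
    answer = (text[:start] + text[end + len(close_tag):]).strip()
    return reasoning, answer


if __name__ == "__main__":
    print(format_chatml([{"role": "user", "content": "Hi"}]))
    print(split_reasoning("<think>2+2</think>4"))
```

In practice the tokenizer's own chat template, if one is shipped with the model, should take precedence over a hand-written formatter like this one.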

Potential Use Cases

This merged model is intended for general-purpose text generation and draws on the strengths of its constituent models. Its Llama 3.1 foundation and the choice of components suggest potential for reasoning, instruction following, and general language understanding tasks, although no benchmark results are reported here.
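For trying the model on such tasks through a hosted API, a request body might be assembled as follows. This is a sketch that assumes the hosting service accepts an OpenAI-style chat completions payload; the field names follow that convention and are not confirmed by this page, and build_chat_request is a hypothetical helper. Only the model identifier and the 32768-token context window come from the card.

```python
import json

MODEL_ID = "Ashapu/anarva-8b-merged"  # identifier from the card
CONTEXT_WINDOW = 32768  # context length stated on the card


def build_chat_request(user_prompt, max_tokens=512, temperature=0.7):
    """Return a JSON string for an OpenAI-style chat completions call.

    Assumption: the endpoint accepts "model", "messages", "max_tokens"
    and "temperature". Nothing is sent over the network here.
    """
    if not 0 < max_tokens < CONTEXT_WINDOW:
        raise ValueError("max_tokens must be positive and below the context window")
    payload = {
        "model": MODEL_ID,
        "messages": [{"role": "user", "content": user_prompt}],
        "max_tokens": max_tokens,
        "temperature": temperature,
    }
    return json.dumps(payload)


if __name__ == "__main__":
    print(build_chat_request("Summarize the DARE TIES merge method in two sentences."))
```

The check on max_tokens is only a coarse guard: the prompt and the completion share the same window, so the real limit on the completion is smaller than 32768.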
