cstr/llama3-8b-spaetzle-v13
cstr/llama3-8b-spaetzle-v13 is an 8-billion-parameter language model merged from Azure99/blossom-v5-llama3-8b and VAGOsolutions/Llama-3-SauerkrautLM-8b-Instruct, with a context length of 8192 tokens. The model performs strongly in both German and English, scoring 64.14 on EQ-Bench v2_de and 75.59 on the English EQ-Bench (v2). It is well suited to general-purpose conversational AI and to tasks requiring robust language understanding in either language.
Model Overview
cstr/llama3-8b-spaetzle-v13 is an 8-billion-parameter language model created by merging two Llama 3-based models: Azure99/blossom-v5-llama3-8b and VAGOsolutions/Llama-3-SauerkrautLM-8b-Instruct. The merge was performed with the dare_ties method, using VAGOsolutions/Llama-3-SauerkrautLM-8b-Instruct as the base model.
Key Capabilities & Performance
This model is designed for general language tasks and performs strongly in both English and German. It uses the standard Llama 3 prompt format. Key benchmark results:
- EQ Bench v2_de: 64.14
- English EQ-Bench Score (v2): 75.59
- Average MMLU: 68.06
- HellaSwag: 85.05
- GSM8K: 67.1
The model demonstrates proficiency in arithmetic and multi-step reasoning, as shown in sample outputs for mathematical problems and complex scenarios.
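Since the model uses the standard Llama 3 prompt format, prompts can be assembled with the documented Llama 3 special tokens. The sketch below builds a single-turn prompt by hand for illustration; in practice, the tokenizer's `apply_chat_template` method handles this automatically, and the helper function name here is hypothetical.

```python
# Sketch of the standard Llama 3 instruct prompt format this model expects.
# The token strings are the documented Llama 3 special tokens.

def build_llama3_prompt(system: str, user: str) -> str:
    """Assemble a single-turn Llama 3 chat prompt by hand (illustrative)."""
    return (
        "<|begin_of_text|>"
        "<|start_header_id|>system<|end_header_id|>\n\n"
        f"{system}<|eot_id|>"
        "<|start_header_id|>user<|end_header_id|>\n\n"
        f"{user}<|eot_id|>"
        "<|start_header_id|>assistant<|end_header_id|>\n\n"
    )

# Example with a German system prompt, reflecting the model's bilingual focus.
prompt = build_llama3_prompt(
    "Du bist ein hilfreicher Assistent.",
    "Was ist die Hauptstadt von Bayern?",
)
print(prompt)
```

The trailing assistant header leaves the prompt open for the model's completion, which ends with its own `<|eot_id|>` token.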
Configuration Details
The merge configuration used a density of 0.65 and a weight of 0.4 for Azure99/blossom-v5-llama3-8b, with int8_mask enabled and dtype set to bfloat16. The tokenizer is taken from the base model.
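A plausible mergekit configuration reconstructed from the parameters above might look as follows; the exact YAML used for the original merge may differ in layout or additional defaults.

```yaml
# Reconstructed mergekit config (illustrative, not the author's original file)
models:
  - model: VAGOsolutions/Llama-3-SauerkrautLM-8b-Instruct
    # base model; no per-model parameters needed
  - model: Azure99/blossom-v5-llama3-8b
    parameters:
      density: 0.65
      weight: 0.4
merge_method: dare_ties
base_model: VAGOsolutions/Llama-3-SauerkrautLM-8b-Instruct
parameters:
  int8_mask: true
dtype: bfloat16
tokenizer_source: base
```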
Ideal Use Cases
This model is suitable for applications requiring:
- Multilingual text generation: Particularly strong in German and English.
- Conversational AI: General chat and instruction-following tasks.
- Reasoning and problem-solving: Demonstrated ability to handle logical and mathematical queries.