cstr/llama3-8b-spaetzle-v20

Hugging Face
Text generation · Model size: 8B · Quant: FP8 · Context length: 8k · License: llama3 · Architecture: Transformer · Concurrency cost: 1

cstr/llama3-8b-spaetzle-v20 is an 8-billion-parameter language model, a merge of cstr/llama3-8b-spaetzle-v13 and nbeerbower/llama-3-wissenschaft-8B-v2 using the dare_ties method. The model scores an average of 71.83 on the Open LLM Leaderboard, including 70.39 on ARC and 68.52 on MMLU. It is designed for general language generation tasks and shows solid performance across common reasoning and knowledge-based evaluations.


Model Overview

cstr/llama3-8b-spaetzle-v20 is an 8-billion-parameter language model created by cstr through a merge of two models: cstr/llama3-8b-spaetzle-v13 and nbeerbower/llama-3-wissenschaft-8B-v2. The merge used the dare_ties method, with nbeerbower/llama-3-wissenschaft-8B-v2 assigned a density of 0.65 and a weight of 0.4.
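These parameters map onto a standard mergekit configuration. Below is a sketch of what the config may have looked like, assuming cstr/llama3-8b-spaetzle-v13 served as the base model; only the dare_ties method, the density and weight for the wissenschaft model, and the bfloat16 dtype are stated on this page, and the rest follows mergekit's usual layout:

```yaml
# Hypothetical mergekit config reconstructing the stated merge parameters.
models:
  - model: cstr/llama3-8b-spaetzle-v13
    # Base model: no explicit density/weight parameters given.
  - model: nbeerbower/llama-3-wissenschaft-8B-v2
    parameters:
      density: 0.65
      weight: 0.4
merge_method: dare_ties
base_model: cstr/llama3-8b-spaetzle-v13
dtype: bfloat16
```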

Performance Benchmarks

This model demonstrates competitive performance across several key benchmarks. On the EQ-Bench v2_de, it achieved a score of 65.7 with 171/171 parseable results. According to the Open LLM Leaderboard, cstr/llama3-8b-spaetzle-v20 has an average score of 71.83. Detailed scores include:

  • ARC: 70.39
  • HellaSwag: 85.69
  • MMLU: 68.52
  • TruthfulQA: 60.98
  • Winogrande: 78.37
  • GSM8K: 67.02
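The leaderboard average quoted above is the simple mean of these six per-task scores, which can be checked directly:

```python
# Recompute the Open LLM Leaderboard average from the six per-task scores.
scores = {
    "ARC": 70.39,
    "HellaSwag": 85.69,
    "MMLU": 68.52,
    "TruthfulQA": 60.98,
    "Winogrande": 78.37,
    "GSM8K": 67.02,
}

average = round(sum(scores.values()) / len(scores), 2)
print(average)  # 71.83, matching the reported leaderboard average
```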

Key Characteristics

  • Architecture: Merged Llama 3-based models.
  • Parameter Count: 8 billion parameters.
  • Context Length: 8192 tokens.
  • Merging Method: dare_ties for combining model strengths.
  • Data Type: bfloat16 for efficient computation.

Usage Considerations

This model is suitable for a range of general-purpose language generation and understanding tasks, particularly where a balance of performance and efficiency for an 8B model is desired. Its benchmark scores suggest proficiency in common reasoning, factual recall, and language comprehension tasks.
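Since both parent models derive from Llama 3 Instruct, prompts presumably follow the standard Llama 3 chat template (this is an assumption; in practice, `tokenizer.apply_chat_template` from the transformers library is the safer route). A minimal sketch of that template, with `format_llama3_chat` as a hypothetical helper:

```python
# Sketch of the Llama 3 instruct prompt layout, assuming this merge
# inherits the standard Llama 3 chat format from its parent models.
def format_llama3_chat(system: str, user: str) -> str:
    return (
        "<|begin_of_text|>"
        f"<|start_header_id|>system<|end_header_id|>\n\n{system}<|eot_id|>"
        f"<|start_header_id|>user<|end_header_id|>\n\n{user}<|eot_id|>"
        "<|start_header_id|>assistant<|end_header_id|>\n\n"
    )

prompt = format_llama3_chat("You are a helpful assistant.", "Hello!")
```

Generation should stop on the `<|eot_id|>` token, per the usual Llama 3 convention.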

Popular Sampler Settings

Supported sampler parameters for this model include: temperature, top_p, top_k, frequency_penalty, presence_penalty, repetition_penalty, and min_p.