cstr/llama3-8b-spaetzle-v20

Hugging Face
Text generation · Model size: 8B · Quant: FP8 · Context length: 8k · License: llama3 · Architecture: Transformer · Concurrency cost: 1

cstr/llama3-8b-spaetzle-v20 is an 8-billion-parameter language model, a merge of cstr/llama3-8b-spaetzle-v13 and nbeerbower/llama-3-wissenschaft-8B-v2 using the dare_ties method. The model scores an average of 71.83 on the Open LLM Leaderboard, including 70.39 on ARC and 68.52 on MMLU. It is designed for general language generation tasks and shows solid performance across common reasoning and knowledge-based evaluations.


Model Overview

cstr/llama3-8b-spaetzle-v20 is an 8-billion-parameter language model created by cstr through a merge of two models: cstr/llama3-8b-spaetzle-v13 and nbeerbower/llama-3-wissenschaft-8B-v2. The merge used the dare_ties method, with nbeerbower/llama-3-wissenschaft-8B-v2 assigned a density of 0.65 and a weight of 0.4.
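These parameters map onto a standard mergekit configuration. Below is a sketch of what the config may have looked like, assuming cstr/llama3-8b-spaetzle-v13 served as the base model; only the dare_ties method, the density and weight for the wissenschaft model, and the bfloat16 dtype are stated on this page, and the rest follows mergekit's usual layout:

```yaml
# Hypothetical mergekit config reconstructing the stated merge parameters.
models:
  - model: cstr/llama3-8b-spaetzle-v13
    # Base model: no explicit density/weight parameters given.
  - model: nbeerbower/llama-3-wissenschaft-8B-v2
    parameters:
      density: 0.65
      weight: 0.4
merge_method: dare_ties
base_model: cstr/llama3-8b-spaetzle-v13
dtype: bfloat16
```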

Performance Benchmarks

This model demonstrates competitive performance across several key benchmarks. On the EQ-Bench v2_de, it achieved a score of 65.7 with 171/171 parseable results. According to the Open LLM Leaderboard, cstr/llama3-8b-spaetzle-v20 has an average score of 71.83. Detailed scores include:

  • ARC: 70.39
  • HellaSwag: 85.69
  • MMLU: 68.52
  • TruthfulQA: 60.98
  • Winogrande: 78.37
  • GSM8K: 67.02
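The leaderboard average quoted above is the simple mean of these six per-task scores, which can be checked directly:

```python
# Recompute the Open LLM Leaderboard average from the six per-task scores.
scores = {
    "ARC": 70.39,
    "HellaSwag": 85.69,
    "MMLU": 68.52,
    "TruthfulQA": 60.98,
    "Winogrande": 78.37,
    "GSM8K": 67.02,
}

average = round(sum(scores.values()) / len(scores), 2)
print(average)  # 71.83, matching the reported leaderboard average
```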

Key Characteristics

  • Architecture: Merged Llama 3-based models.
  • Parameter Count: 8 billion parameters.
  • Context Length: 8192 tokens.
  • Merging Method: dare_ties for combining model strengths.
  • Data Type: bfloat16 for efficient computation.

Usage Considerations

This model is suitable for a range of general-purpose language generation and understanding tasks, particularly where a balance of performance and efficiency for an 8B model is desired. Its benchmark scores suggest proficiency in common reasoning, factual recall, and language comprehension tasks.
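Since both parent models derive from Llama 3 Instruct, prompts presumably follow the standard Llama 3 chat template (this is an assumption; in practice, `tokenizer.apply_chat_template` from the transformers library is the safer route). A minimal sketch of that template, with `format_llama3_chat` as a hypothetical helper:

```python
# Sketch of the Llama 3 instruct prompt layout, assuming this merge
# inherits the standard Llama 3 chat format from its parent models.
def format_llama3_chat(system: str, user: str) -> str:
    return (
        "<|begin_of_text|>"
        f"<|start_header_id|>system<|end_header_id|>\n\n{system}<|eot_id|>"
        f"<|start_header_id|>user<|end_header_id|>\n\n{user}<|eot_id|>"
        "<|start_header_id|>assistant<|end_header_id|>\n\n"
    )

prompt = format_llama3_chat("You are a helpful assistant.", "Hello!")
```

Generation should stop on the `<|eot_id|>` token, per the usual Llama 3 convention.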

Popular Sampler Settings

Supported sampler parameters for this model include: temperature, top_p, top_k, frequency_penalty, presence_penalty, repetition_penalty, and min_p.