Dampfinchen/Llama-3-8B-Ultra-Instruct-SaltSprinkle

Available on Hugging Face

Text Generation · Concurrency Cost: 1 · Model Size: 8B · Quant: FP8 · Context Length: 8K · License: llama3 · Architecture: Transformer · Status: Warm

Dampfinchen/Llama-3-8B-Ultra-Instruct-SaltSprinkle is an 8 billion parameter language model merged from NousResearch/Meta-Llama-3-8B-Instruct and Dampfinchen/Llama-3-8B-Ultra-Instruct using the DARE TIES method. This model aims to combine the strong base capabilities of Llama-3-8B-Instruct with enhanced roleplay, RAG, German language proficiency, and story writing from Ultra-Instruct. It maintains a context length of 8192 tokens and achieves an average score of 67.61 on the Open LLM Leaderboard.


Model Overview

Dampfinchen/Llama-3-8B-Ultra-Instruct-SaltSprinkle is an 8 billion parameter language model created by Dampfinchen through a merge of two distinct Llama-3-8B variants. Utilizing the DARE TIES merge method, this model integrates the foundational strengths of NousResearch/Meta-Llama-3-8B-Instruct with the specialized enhancements of Dampfinchen/Llama-3-8B-Ultra-Instruct.

Key Capabilities

  • Enhanced Instruction Following: Retains the robust instruction-following capabilities of the base Llama-3-8B-Instruct model.
  • Specialized Content Generation: Aims to improve performance in specific areas such as:
    • Roleplay (RP)
    • Retrieval-Augmented Generation (RAG)
    • German language understanding and generation
    • Story writing
  • Merge Method: Employs the DARE TIES method, which selectively merges parameters to combine strengths while mitigating potential conflicts.
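The two stages of DARE TIES can be illustrated on plain Python lists. This is an educational sketch, not the mergekit implementation: `dare` drops a random fraction of each task vector (the delta between a fine-tuned model and the base) and rescales the survivors, and `ties_sign_elect` resolves sign conflicts between the candidate deltas before they are added back to the base weights.

```python
import random

def dare(delta, p, rng):
    # DARE (Drop And REscale): zero out a random fraction p of the
    # task-vector entries, rescale survivors by 1/(1-p) so the
    # expected magnitude of the delta is preserved.
    return [0.0 if rng.random() < p else d / (1.0 - p) for d in delta]

def ties_sign_elect(deltas):
    # TIES sign election: for each parameter, elect the sign of the
    # summed contributions, keep only entries that agree with it,
    # and average the survivors.
    merged = []
    for entries in zip(*deltas):
        total = sum(entries)
        sign = 1.0 if total >= 0 else -1.0
        kept = [e for e in entries if e * sign > 0]
        merged.append(sum(kept) / len(kept) if kept else 0.0)
    return merged

# Toy example: two sparsified task vectors over two parameters.
merged_delta = ties_sign_elect([[1.0, -2.0], [3.0, 1.0]])
```

In the toy example the first parameter keeps both positive contributions (averaging to 2.0), while the second keeps only the dominant negative one, which is how TIES avoids interference between the merged models.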

Performance Metrics

Evaluated on the Open LLM Leaderboard, the model achieved an average score of 67.61. Notable scores include:

  • AI2 Reasoning Challenge (25-shot): 61.35
  • HellaSwag (10-shot): 77.76
  • MMLU (5-shot): 67.88
  • TruthfulQA (0-shot): 52.82
  • Winogrande (5-shot): 74.98
  • GSM8k (5-shot): 70.89
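The leaderboard average is simply the unweighted mean of the six benchmark scores above, which can be checked in a couple of lines:

```python
# Open LLM Leaderboard scores for this model, from the section above.
scores = {
    "ARC (25-shot)": 61.35,
    "HellaSwag (10-shot)": 77.76,
    "MMLU (5-shot)": 67.88,
    "TruthfulQA (0-shot)": 52.82,
    "Winogrande (5-shot)": 74.98,
    "GSM8k (5-shot)": 70.89,
}

# Unweighted mean, rounded to two decimals as reported on the leaderboard.
average = round(sum(scores.values()) / len(scores), 2)
```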

Good For

  • Applications requiring a balance of general instruction following and specialized creative or factual generation.
  • Use cases where improved German language capabilities are beneficial.
  • Developers looking for a Llama-3-8B variant with enhanced roleplay and story writing abilities.
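Since both parents are Llama-3-8B-Instruct derivatives, prompts for this merge presumably use the standard Llama 3 Instruct chat template. A minimal, dependency-free sketch of that template (in practice you would let the tokenizer's own chat template do this):

```python
def format_llama3(messages):
    # Build a Llama 3 Instruct prompt from a list of
    # {"role": ..., "content": ...} dicts, ending with an open
    # assistant header so the model continues as the assistant.
    parts = ["<|begin_of_text|>"]
    for m in messages:
        parts.append(
            f"<|start_header_id|>{m['role']}<|end_header_id|>\n\n"
            f"{m['content']}<|eot_id|>"
        )
    parts.append("<|start_header_id|>assistant<|end_header_id|>\n\n")
    return "".join(parts)

prompt = format_llama3([
    {"role": "system", "content": "You are a helpful assistant."},
    {"role": "user", "content": "Schreibe einen kurzen Satz auf Deutsch."},
])
```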

Popular Sampler Settings

Top 3 parameter combinations used by Featherless users for this model. Each configuration specifies values for the following sampler parameters:

  • temperature
  • top_p
  • top_k
  • frequency_penalty
  • presence_penalty
  • repetition_penalty
  • min_p
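The distribution-truncation parameters in that list (temperature, top_k, top_p, min_p) can be sketched in a few lines of plain Python. This is an illustrative implementation, not Featherless's actual sampler; the penalty parameters, which adjust logits based on previously generated tokens, are omitted for brevity.

```python
import math

def softmax(xs):
    # Numerically stable softmax over a list of logits.
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]

def filter_probs(logits, temperature=1.0, top_k=0, top_p=1.0, min_p=0.0):
    # Apply temperature, then top-k, top-p (nucleus) and min-p
    # truncation; return the renormalized probabilities.
    probs = softmax([l / temperature for l in logits])
    order = sorted(range(len(probs)), key=lambda i: -probs[i])

    # top-k: keep only the k most probable tokens (0 disables it).
    keep = set(order if top_k <= 0 else order[:top_k])

    # top-p: keep the smallest prefix whose cumulative mass >= top_p.
    nucleus, cum = set(), 0.0
    for i in order:
        nucleus.add(i)
        cum += probs[i]
        if cum >= top_p:
            break
    keep &= nucleus

    # min-p: drop tokens below min_p times the top token's probability.
    cutoff = min_p * probs[order[0]]
    keep = {i for i in keep if probs[i] >= cutoff}

    filtered = [p if i in keep else 0.0 for i, p in enumerate(probs)]
    total = sum(filtered)
    return [p / total for p in filtered]

out = filter_probs([2.0, 1.0, 0.0], top_k=2)
```

In the usage example, top_k=2 zeroes the least likely token and renormalizes the remaining two, which is exactly what these settings do to the model's next-token distribution before sampling.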