VAGOsolutions/Llama-3-SauerkrautLM-8b-Instruct

Hugging Face
Text Generation · Concurrency Cost: 1 · Model Size: 8B · Quant: FP8 · Ctx Length: 8k · Published: Apr 19, 2024 · License: llama3 · Architecture: Transformer

VAGOsolutions/Llama-3-SauerkrautLM-8b-Instruct is an 8 billion parameter Llama-3-based instruction-tuned language model developed jointly by VAGO Solutions and Hyperspace.ai. This model is fine-tuned using DPO with curated German data, significantly enhancing its capabilities in both German and English. It is optimized for general conversational tasks and demonstrates strong performance across various benchmarks, including a German RAG LLM Evaluation score of 0.910.


VAGOsolutions/Llama-3-SauerkrautLM-8b-Instruct Overview

VAGOsolutions/Llama-3-SauerkrautLM-8b-Instruct is an 8 billion parameter instruction-tuned model, a collaborative effort between VAGO Solutions and Hyperspace.ai. It is built on Meta's Llama-3-8B-Instruct and has undergone a two-stage DPO (Direct Preference Optimization) fine-tuning process using 70k and 20k data points, respectively. Its key differentiator is enhanced German performance, achieved by training on curated German data while maintaining strong English capabilities.
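The two-stage DPO alignment mentioned above optimizes the standard DPO objective on preference pairs. A minimal sketch of the per-pair loss, using only the Python standard library (the `beta` value and log-probabilities below are illustrative, not taken from this model's training recipe):

```python
import math

def dpo_loss(policy_chosen, policy_rejected, ref_chosen, ref_rejected, beta=0.1):
    """Direct Preference Optimization loss for one preference pair.

    Each argument is the total log-probability of a response under the
    trainable policy or the frozen reference model; beta scales the
    implicit reward margin.
    """
    margin = beta * ((policy_chosen - ref_chosen) - (policy_rejected - ref_rejected))
    # -log(sigmoid(margin)): shrinks as the policy prefers the chosen
    # response more strongly than the reference model does
    return -math.log(1.0 / (1.0 + math.exp(-margin)))
```

When the policy and reference agree exactly, the margin is zero and the loss is `log(2)`; training pushes the margin positive, driving the loss toward zero.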

Key Capabilities & Features

  • Bilingual Proficiency: Optimized for both German and English language tasks.
  • DPO Alignment: Aligned using DPO for improved instruction following.
  • Llama-3 Base: Benefits from the robust architecture of Meta's Llama-3-8B-Instruct.
  • Quantized Versions: Available in HF, EXL2, and GGUF formats for flexible deployment.
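Because the model inherits Llama-3's chat template, prompts must follow the Llama-3 header/end-of-turn token format. A minimal sketch that builds such a prompt by hand (in practice, `tokenizer.apply_chat_template` from `transformers` does this for you; the example messages are illustrative):

```python
def build_llama3_prompt(messages):
    """Format chat messages using the Llama-3 instruct template."""
    prompt = "<|begin_of_text|>"
    for m in messages:
        prompt += (
            f"<|start_header_id|>{m['role']}<|end_header_id|>\n\n"
            f"{m['content']}<|eot_id|>"
        )
    # Open an assistant turn to cue the model's reply
    prompt += "<|start_header_id|>assistant<|end_header_id|>\n\n"
    return prompt

prompt = build_llama3_prompt([
    {"role": "system", "content": "Du bist ein hilfreicher Assistent."},
    {"role": "user", "content": "Was ist Sauerkraut?"},
])
```

The same string works regardless of which distribution you deploy (HF, EXL2, or GGUF), since the template is a property of the base model, not the quantization format.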

Performance Highlights

  • Open LLM Leaderboard Average: Achieves an average score of 74.57 across ARC, HellaSwag, MMLU, TruthfulQA, Winogrande, and GSM8K.
  • MT-Bench English: Scores an average of 7.903125, with a slight reduction compared to the original Llama-3-8B-Instruct due to specific instruction training.
  • MT-Bench German: Demonstrates strong German conversational ability with an average score of 7.65625.
  • German RAG LLM Evaluation: Achieves an accuracy of 0.910, indicating strong performance in German Retrieval Augmented Generation tasks.

Use Cases

This model is particularly well-suited for applications requiring a capable instruction-following LLM with a strong emphasis on German language understanding and generation, alongside general English conversational abilities. Its fine-tuning with curated German data makes it a valuable asset for German-centric AI solutions.

Popular Sampler Settings

Featherless users most commonly tune the following sampler parameters for this model: temperature, top_p, top_k, frequency_penalty, presence_penalty, repetition_penalty, and min_p.
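The sampler parameters listed above map directly onto fields of an OpenAI-compatible completion request. A minimal sketch assembling such a request body (the specific values are illustrative assumptions, not recommended settings from this page):

```python
def completion_payload(prompt, **samplers):
    """Build an OpenAI-compatible completions request body with sampler settings."""
    allowed = {"temperature", "top_p", "top_k", "frequency_penalty",
               "presence_penalty", "repetition_penalty", "min_p"}
    unknown = set(samplers) - allowed
    if unknown:
        raise ValueError(f"unsupported sampler parameters: {sorted(unknown)}")
    return {
        "model": "VAGOsolutions/Llama-3-SauerkrautLM-8b-Instruct",
        "prompt": prompt,
        **samplers,
    }

payload = completion_payload("Erkläre DPO in einem Satz.",
                             temperature=0.7, top_p=0.9, min_p=0.05)
```

The guard against unknown keys catches typos early, since most serving APIs silently ignore unrecognized sampler fields rather than rejecting them.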