VAGOsolutions/Llama-3-SauerkrautLM-8b-Instruct

Warm
Public
8B
FP8
8192
License: llama3
Hugging Face
Overview

VAGOsolutions/Llama-3-SauerkrautLM-8b-Instruct Overview

VAGOsolutions/Llama-3-SauerkrautLM-8b-Instruct is an 8 billion parameter instruction-tuned model, a collaborative effort between VAGO Solutions and Hyperspace.ai. It is built upon Meta's Llama-3-8B-Instruct and has undergone a two-stage DPO (Direct Preference Optimization) fine-tuning process using 70k and 20k data points respectively. A key differentiator for this model is its enhanced performance in German, achieved by feeding it with curated German data, while also maintaining strong English capabilities.

Key Capabilities & Features

  • Bilingual Proficiency: Optimized for both German and English language tasks.
  • DPO Alignment: Aligned using DPO for improved instruction following.
  • Llama-3 Base: Benefits from the robust architecture of Meta's Llama-3-8B-Instruct.
  • Quantized Versions: Available in HF, EXL2, and GGUF formats for flexible deployment.

Performance Highlights

  • Open LLM Leaderboard Average: Achieves an average score of 74.57 across ARC, HellaSwag, MMLU, TruthfulQA, Winogrande, and GSM8K.
  • MT-Bench English: Scores an average of 7.903125, with a slight reduction compared to the original Llama-3-8B-Instruct due to specific instruction training.
  • MT-Bench German: Demonstrates strong German conversational ability with an average score of 7.65625.
  • German RAG LLM Evaluation: Achieves an accuracy of 0.910, indicating strong performance in German Retrieval Augmented Generation tasks.

Use Cases

This model is particularly well-suited for applications requiring a capable instruction-following LLM with a strong emphasis on German language understanding and generation, alongside general English conversational abilities. Its fine-tuning with curated German data makes it a valuable asset for German-centric AI solutions.