VAGOsolutions/Llama-3-SauerkrautLM-8b-Instruct Overview
VAGOsolutions/Llama-3-SauerkrautLM-8b-Instruct is an 8-billion-parameter instruction-tuned model, a collaboration between VAGO Solutions and Hyperspace.ai. It is built on Meta's Llama-3-8B-Instruct and has undergone a two-stage DPO (Direct Preference Optimization) fine-tuning process using 70k and 20k data points in the respective stages. Its key differentiator is enhanced German performance, achieved by training on curated German data while maintaining strong English capabilities.
Key Capabilities & Features
- Bilingual Proficiency: Optimized for both German and English language tasks.
- DPO Alignment: Aligned using DPO for improved instruction following.
- Llama-3 Base: Benefits from the robust architecture of Meta's Llama-3-8B-Instruct.
- Quantized Versions: Available in HF, EXL2, and GGUF formats for flexible deployment.
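The formats above can be consumed with different runtimes. A minimal loading sketch, assuming the `transformers` library for the HF weights and `llama-cpp-python` for GGUF; the local GGUF filename is a hypothetical placeholder, not a published artifact name:

```python
def load_sauerkraut(fmt: str = "hf"):
    """Load SauerkrautLM in one of its published formats (sketch, not official usage)."""
    model_id = "VAGOsolutions/Llama-3-SauerkrautLM-8b-Instruct"
    if fmt == "hf":
        # Heavy dependencies imported lazily; requires network and ~16 GB in fp16.
        from transformers import AutoModelForCausalLM, AutoTokenizer
        tokenizer = AutoTokenizer.from_pretrained(model_id)
        model = AutoModelForCausalLM.from_pretrained(
            model_id,
            torch_dtype="auto",   # take fp16/bf16 from the checkpoint config
            device_map="auto",    # spread layers across available devices
        )
        return tokenizer, model
    if fmt == "gguf":
        from llama_cpp import Llama  # pip install llama-cpp-python
        # Path to a locally downloaded GGUF file -- illustrative placeholder only.
        return Llama(model_path="sauerkrautlm-8b-instruct.Q4_K_M.gguf")
    raise ValueError(f"unknown format: {fmt!r}")
```

EXL2 weights would be loaded analogously through an ExLlamaV2-based runtime; the pattern is the same: pick the format that matches your memory budget and inference stack.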
Performance Highlights
- Open LLM Leaderboard Average: Achieves an average score of 74.57 across ARC, HellaSwag, MMLU, TruthfulQA, Winogrande, and GSM8K.
- MT-Bench English: Scores an average of 7.903125, slightly below the original Llama-3-8B-Instruct, a reduction attributed to the model's additional instruction training.
- MT-Bench German: Demonstrates strong German conversational ability with an average score of 7.65625.
- German RAG LLM Evaluation: Achieves an accuracy of 0.910, indicating strong performance in German Retrieval Augmented Generation tasks.
Use Cases
This model is particularly well-suited for applications requiring a capable instruction-following LLM with a strong emphasis on German language understanding and generation, alongside general English conversational abilities. Its fine-tuning with curated German data makes it a valuable asset for German-centric AI solutions.
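As a usage illustration for such German-centric applications, here is a hedged sketch of a single German chat turn using the `transformers` chat-template API. The system prompt and generation settings are assumptions for demonstration, not recommendations from the model card:

```python
MODEL_ID = "VAGOsolutions/Llama-3-SauerkrautLM-8b-Instruct"

def build_messages(user_prompt: str,
                   system_prompt: str = "Du bist ein hilfreicher Assistent.") -> list:
    """Assemble a Llama-3-style chat message list with one system and one user turn."""
    return [
        {"role": "system", "content": system_prompt},
        {"role": "user", "content": user_prompt},
    ]

def chat(user_prompt: str) -> str:
    """Generate a reply (requires network, GPU/CPU memory for an 8B model)."""
    from transformers import AutoModelForCausalLM, AutoTokenizer  # lazy heavy imports
    tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
    model = AutoModelForCausalLM.from_pretrained(
        MODEL_ID, torch_dtype="auto", device_map="auto"
    )
    input_ids = tokenizer.apply_chat_template(
        build_messages(user_prompt),
        add_generation_prompt=True,
        return_tensors="pt",
    ).to(model.device)
    output = model.generate(
        input_ids,
        max_new_tokens=256,
        do_sample=True,
        temperature=0.7,  # illustrative sampling settings
    )
    # Decode only the newly generated tokens, not the prompt.
    return tokenizer.decode(output[0][input_ids.shape[-1]:], skip_special_tokens=True)
```

For example, `chat("Erkläre kurz, was DPO ist.")` would return a German-language answer; an English prompt works equally well given the model's bilingual training.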