VAGOsolutions/Llama-3-SauerkrautLM-8b-Instruct

Hugging Face
Text Generation · Concurrency Cost: 1 · Model Size: 8B · Quant: FP8 · Ctx Length: 8k · Published: Apr 19, 2024 · License: llama3 · Architecture: Transformer

VAGOsolutions/Llama-3-SauerkrautLM-8b-Instruct is an 8 billion parameter Llama-3-based instruction-tuned language model developed jointly by VAGO Solutions and Hyperspace.ai. This model is fine-tuned using DPO with curated German data, significantly enhancing its capabilities in both German and English. It is optimized for general conversational tasks and demonstrates strong performance across various benchmarks, including a German RAG LLM Evaluation score of 0.910.


VAGOsolutions/Llama-3-SauerkrautLM-8b-Instruct Overview

VAGOsolutions/Llama-3-SauerkrautLM-8b-Instruct is an 8 billion parameter instruction-tuned model, a collaborative effort between VAGO Solutions and Hyperspace.ai. It is built on Meta's Llama-3-8B-Instruct and has undergone a two-stage DPO (Direct Preference Optimization) fine-tuning process using 70k and 20k data points, respectively. Its key differentiator is enhanced German performance, achieved by training on curated German data while maintaining strong English capabilities.
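The two-stage DPO alignment mentioned above optimizes the standard DPO objective on preference pairs. A minimal sketch of the per-pair loss, using only the Python standard library (the `beta` value and log-probabilities below are illustrative, not taken from this model's training recipe):

```python
import math

def dpo_loss(policy_chosen, policy_rejected, ref_chosen, ref_rejected, beta=0.1):
    """Direct Preference Optimization loss for one preference pair.

    Each argument is the total log-probability of a response under the
    trainable policy or the frozen reference model; beta scales the
    implicit reward margin.
    """
    margin = beta * ((policy_chosen - ref_chosen) - (policy_rejected - ref_rejected))
    # -log(sigmoid(margin)): shrinks as the policy prefers the chosen
    # response more strongly than the reference model does
    return -math.log(1.0 / (1.0 + math.exp(-margin)))
```

When the policy and reference agree exactly, the margin is zero and the loss is `log(2)`; training pushes the margin positive, driving the loss toward zero.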

Key Capabilities & Features

  • Bilingual Proficiency: Optimized for both German and English language tasks.
  • DPO Alignment: Aligned using DPO for improved instruction following.
  • Llama-3 Base: Benefits from the robust architecture of Meta's Llama-3-8B-Instruct.
  • Quantized Versions: Available in HF, EXL2, and GGUF formats for flexible deployment.
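Because the model inherits Llama-3's chat template, prompts must follow the Llama-3 header/end-of-turn token format. A minimal sketch that builds such a prompt by hand (in practice, `tokenizer.apply_chat_template` from `transformers` does this for you; the example messages are illustrative):

```python
def build_llama3_prompt(messages):
    """Format chat messages using the Llama-3 instruct template."""
    prompt = "<|begin_of_text|>"
    for m in messages:
        prompt += (
            f"<|start_header_id|>{m['role']}<|end_header_id|>\n\n"
            f"{m['content']}<|eot_id|>"
        )
    # Open an assistant turn to cue the model's reply
    prompt += "<|start_header_id|>assistant<|end_header_id|>\n\n"
    return prompt

prompt = build_llama3_prompt([
    {"role": "system", "content": "Du bist ein hilfreicher Assistent."},
    {"role": "user", "content": "Was ist Sauerkraut?"},
])
```

The same string works regardless of which distribution you deploy (HF, EXL2, or GGUF), since the template is a property of the base model, not the quantization format.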

Performance Highlights

  • Open LLM Leaderboard Average: Achieves an average score of 74.57 across ARC, HellaSwag, MMLU, TruthfulQA, Winogrande, and GSM8K.
  • MT-Bench English: Scores an average of 7.903125, with a slight reduction compared to the original Llama-3-8B-Instruct due to specific instruction training.
  • MT-Bench German: Demonstrates strong German conversational ability with an average score of 7.65625.
  • German RAG LLM Evaluation: Achieves an accuracy of 0.910, indicating strong performance in German Retrieval Augmented Generation tasks.

Use Cases

This model is particularly well-suited for applications requiring a capable instruction-following LLM with a strong emphasis on German language understanding and generation, alongside general English conversational abilities. Its fine-tuning with curated German data makes it a valuable asset for German-centric AI solutions.

Popular Sampler Settings

Featherless users most commonly tune the following sampler parameters for this model: temperature, top_p, top_k, frequency_penalty, presence_penalty, repetition_penalty, and min_p.
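The sampler parameters listed above map directly onto fields of an OpenAI-compatible completion request. A minimal sketch assembling such a request body (the specific values are illustrative assumptions, not recommended settings from this page):

```python
def completion_payload(prompt, **samplers):
    """Build an OpenAI-compatible completions request body with sampler settings."""
    allowed = {"temperature", "top_p", "top_k", "frequency_penalty",
               "presence_penalty", "repetition_penalty", "min_p"}
    unknown = set(samplers) - allowed
    if unknown:
        raise ValueError(f"unsupported sampler parameters: {sorted(unknown)}")
    return {
        "model": "VAGOsolutions/Llama-3-SauerkrautLM-8b-Instruct",
        "prompt": prompt,
        **samplers,
    }

payload = completion_payload("Erkläre DPO in einem Satz.",
                             temperature=0.7, top_p=0.9, min_p=0.05)
```

The guard against unknown keys catches typos early, since most serving APIs silently ignore unrecognized sampler fields rather than rejecting them.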