SauerkrautLM-7b-HerO: A Bilingual German-English LLM
VAGO solutions' SauerkrautLM-7b-HerO is a 7-billion-parameter language model built on the Mistral architecture and designed for high proficiency in both German and English. It is a fusion of two high-performing 7B models: Teknium's OpenHermes-2.5-Mistral-7B and Open-Orca's Mistral-7B-OpenOrca. The merge was performed with the gradient SLERP method implemented in MergeKit, interpolating between the two models layer by layer to combine their respective strengths.
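A gradient SLERP merge of this kind is typically expressed as a MergeKit YAML config. The sketch below shows the general shape of such a config for these two source models; the layer ranges, interpolation weights, and choice of base model are illustrative assumptions, not the published configuration:

```yaml
# Illustrative MergeKit config for a gradient SLERP merge (not the
# actual config used for SauerkrautLM-7b-HerO).
slices:
  - sources:
      - model: teknium/OpenHermes-2.5-Mistral-7B
        layer_range: [0, 32]
      - model: Open-Orca/Mistral-7B-OpenOrca
        layer_range: [0, 32]
merge_method: slerp
base_model: teknium/OpenHermes-2.5-Mistral-7B
parameters:
  t:
    # "Gradient" SLERP: the interpolation factor t varies across layers
    # and can differ per module type (example values only).
    - filter: self_attn
      value: [0.0, 0.3, 0.5, 0.7, 1.0]
    - filter: mlp
      value: [1.0, 0.7, 0.5, 0.3, 0.0]
    - value: 0.5
dtype: bfloat16
```

With MergeKit installed, a config like this would be run via its command-line entry point to produce the merged checkpoint.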
Key Capabilities & Differentiators
- Bilingual Proficiency: SauerkrautLM-7b-HerO was fine-tuned with a proprietary "Sauerkraut dataset" consisting of augmented and translated German data. This approach successfully taught the merged English-speaking model the intricacies of German without compromising its core English competencies, a common challenge in cross-lingual fine-tuning.
- Optimized German Wording: The training dataset utilized data augmentation techniques to ensure grammatical and syntactical correctness, leading to more natural German phrasing compared to simple translation.
- Strong Performance: Evaluation across several benchmarks, including MT-Bench (German and English), the GPT4ALL suite, and the Language Model Evaluation Harness, shows competitive results against other German and general-purpose LLMs in its size class.
Ideal Use Cases
- German Language Applications: Excellent for tasks requiring nuanced understanding and generation of German text, such as customer support, content creation, and translation.
- Bilingual Environments: Suitable for applications that need to operate seamlessly in both German and English, maintaining high performance across languages.
- Research & Development: Provides a robust base for further fine-tuning or experimentation in bilingual language modeling, particularly for German-centric projects.
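For the applications above, prompts are typically assembled in the ChatML format that OpenHermes-based models use; whether SauerkrautLM-7b-HerO expects exactly this template is an assumption worth checking against its model card. The helper below and the German example messages are purely illustrative:

```python
def build_chatml_prompt(system: str, user: str) -> str:
    """Assemble a ChatML prompt, the format commonly used by
    OpenHermes-derived models (assumed here for SauerkrautLM-7b-HerO)."""
    return (
        f"<|im_start|>system\n{system}<|im_end|>\n"
        f"<|im_start|>user\n{user}<|im_end|>\n"
        f"<|im_start|>assistant\n"
    )

# German system and user messages, matching the model's bilingual focus.
prompt = build_chatml_prompt(
    "Du bist ein hilfreicher Assistent.",        # "You are a helpful assistant."
    "Erkläre kurz, was ein Sprachmodell ist.",   # "Briefly explain what a language model is."
)
print(prompt)
```

The resulting string would then be tokenized and passed to the model (e.g. via the Hugging Face `transformers` library); newer `transformers` versions can also apply the model's own chat template automatically with `tokenizer.apply_chat_template`.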