SauerkrautLM-7b-HerO: A Bilingual German-English LLM
VAGO solutions' SauerkrautLM-7b-HerO is a 7-billion-parameter language model built on the Mistral architecture and designed for high proficiency in both German and English. It is a fusion of two high-performing 7B models: Teknium's OpenHermes-2.5-Mistral-7B and Open-Orca's Mistral-7B-OpenOrca. The merge was performed with the gradient SLERP method implemented in MergeKit, interpolating between the two models layer by layer to combine their respective strengths.
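A gradient SLERP merge of this kind is typically expressed as a MergeKit YAML config. The sketch below shows the general shape of such a config for these two source models; the layer ranges, interpolation weights, and choice of base model are illustrative assumptions, not the published configuration:

```yaml
# Illustrative MergeKit config for a gradient SLERP merge (not the
# actual config used for SauerkrautLM-7b-HerO).
slices:
  - sources:
      - model: teknium/OpenHermes-2.5-Mistral-7B
        layer_range: [0, 32]
      - model: Open-Orca/Mistral-7B-OpenOrca
        layer_range: [0, 32]
merge_method: slerp
base_model: teknium/OpenHermes-2.5-Mistral-7B
parameters:
  t:
    # "Gradient" SLERP: the interpolation factor t varies across layers
    # and can differ per module type (example values only).
    - filter: self_attn
      value: [0.0, 0.3, 0.5, 0.7, 1.0]
    - filter: mlp
      value: [1.0, 0.7, 0.5, 0.3, 0.0]
    - value: 0.5
dtype: bfloat16
```

With MergeKit installed, a config like this would be run via its command-line entry point to produce the merged checkpoint.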
Key Capabilities & Differentiators
- Bilingual Proficiency: SauerkrautLM-7b-HerO was fine-tuned with a proprietary "Sauerkraut dataset" consisting of augmented and translated German data. This approach successfully taught the merged English-speaking model the intricacies of German without compromising its core English competencies, a common challenge in cross-lingual fine-tuning.
- Optimized German Wording: The training dataset utilized data augmentation techniques to ensure grammatical and syntactical correctness, leading to more natural German phrasing compared to simple translation.
- Strong Performance: Evaluation across several benchmarks, including MT-Bench (German and English), the GPT4ALL suite, and the Language Model Evaluation Harness, shows competitive results against other German and general-purpose LLMs in its size class.
Ideal Use Cases
- German Language Applications: Excellent for tasks requiring nuanced understanding and generation of German text, such as customer support, content creation, and translation.
- Bilingual Environments: Suitable for applications that need to operate seamlessly in both German and English, maintaining high performance across languages.
- Research & Development: Provides a robust base for further fine-tuning or experimentation in bilingual language modeling, particularly for German-centric projects.
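For the applications above, prompts are typically assembled in the ChatML format that OpenHermes-based models use; whether SauerkrautLM-7b-HerO expects exactly this template is an assumption worth checking against its model card. The helper below and the German example messages are purely illustrative:

```python
def build_chatml_prompt(system: str, user: str) -> str:
    """Assemble a ChatML prompt, the format commonly used by
    OpenHermes-derived models (assumed here for SauerkrautLM-7b-HerO)."""
    return (
        f"<|im_start|>system\n{system}<|im_end|>\n"
        f"<|im_start|>user\n{user}<|im_end|>\n"
        f"<|im_start|>assistant\n"
    )

# German system and user messages, matching the model's bilingual focus.
prompt = build_chatml_prompt(
    "Du bist ein hilfreicher Assistent.",        # "You are a helpful assistant."
    "Erkläre kurz, was ein Sprachmodell ist.",   # "Briefly explain what a language model is."
)
print(prompt)
```

The resulting string would then be tokenized and passed to the model (e.g. via the Hugging Face `transformers` library); newer `transformers` versions can also apply the model's own chat template automatically with `tokenizer.apply_chat_template`.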