VAGOsolutions/SauerkrautLM-7b-HerO

Text Generation · Concurrency cost: 1 · Model size: 7B · Quantization: FP8 · Context length: 4K · Published: Nov 24, 2023 · License: apache-2.0 · Architecture: Transformer · Open weights

VAGOsolutions/SauerkrautLM-7b-HerO is a 7 billion parameter German-English bilingual language model based on the Mistral architecture, developed by VAGO solutions. It was created by merging Teknium's OpenHermes-2.5-Mistral-7B and Open-Orca's Mistral-7B-OpenOrca using the gradient SLERP method, then fine-tuned on an augmented German-language dataset. The model targets strong German understanding while retaining its English capabilities, avoiding the performance degradation that cross-lingual fine-tuning often causes.


SauerkrautLM-7b-HerO: A Bilingual German-English LLM

VAGO solutions' SauerkrautLM-7b-HerO is a 7 billion parameter language model built upon the Mistral framework, designed for high proficiency in both German and English. It is a fusion of two high-performing 7B models: Teknium's OpenHermes-2.5-Mistral-7B and Open-Orca's Mistral-7B-OpenOrca. The merge was executed with MergeKit's gradient SLERP method, which spherically interpolates the two models' weights with interpolation factors that vary across layers, combining their respective strengths.
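A gradient SLERP merge of this kind is typically expressed as a MergeKit YAML config. The sketch below is illustrative only: the layer ranges, base model choice, and per-filter interpolation gradients are assumptions, not the exact values VAGO solutions used.

```yaml
# Illustrative MergeKit config for a gradient SLERP merge of the two parents.
# The t gradients below are example values, not the published recipe.
slices:
  - sources:
      - model: teknium/OpenHermes-2.5-Mistral-7B
        layer_range: [0, 32]
      - model: Open-Orca/Mistral-7B-OpenOrca
        layer_range: [0, 32]
merge_method: slerp
base_model: teknium/OpenHermes-2.5-Mistral-7B
parameters:
  t:
    - filter: self_attn
      value: [0, 0.5, 0.3, 0.7, 1]   # interpolation factor varies by depth
    - filter: mlp
      value: [1, 0.5, 0.7, 0.3, 0]
    - value: 0.5                      # default for all remaining tensors
dtype: bfloat16
```

With a config like this, `mergekit-yaml config.yml ./merged-model` would produce the merged checkpoint that is then fine-tuned on the German data.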

Key Capabilities & Differentiators

  • Bilingual Proficiency: SauerkrautLM-7b-HerO was fine-tuned with a proprietary "Sauerkraut dataset" consisting of augmented and translated German data. This approach successfully taught the merged English-speaking model the intricacies of German without compromising its core English competencies, a common challenge in cross-lingual fine-tuning.
  • Optimized German Wording: The training dataset utilized data augmentation techniques to ensure grammatical and syntactical correctness, leading to more natural German phrasing compared to simple translation.
  • Strong Performance: Evaluation across various benchmarks, including MT-Bench (German and English), GPT4ALL, and Language Model Evaluation Harness, demonstrates its competitive performance against other German and general-purpose LLMs in its class.

Ideal Use Cases

  • German Language Applications: Excellent for tasks requiring nuanced understanding and generation of German text, such as customer support, content creation, and translation.
  • Bilingual Environments: Suitable for applications that need to operate seamlessly in both German and English, maintaining high performance across languages.
  • Research & Development: Provides a robust base for further fine-tuning or experimentation in bilingual language modeling, particularly for German-centric projects.
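For any of the use cases above, the model can be served through Hugging Face transformers. The sketch below assumes the model follows the ChatML prompt format inherited from its OpenHermes-2.5 parent; the helper function and system prompt are illustrative, not part of the official card.

```python
def build_chatml_prompt(system: str, user: str) -> str:
    """Build a ChatML-style prompt (format assumed from the OpenHermes-2.5 lineage)."""
    return (
        f"<|im_start|>system\n{system}<|im_end|>\n"
        f"<|im_start|>user\n{user}<|im_end|>\n"
        f"<|im_start|>assistant\n"
    )

def generate_reply(system: str, user: str, max_new_tokens: int = 256) -> str:
    """Load the model and generate one reply. Downloads ~7B weights on first call."""
    # Lazy import so the prompt helper stays usable without transformers installed.
    from transformers import AutoModelForCausalLM, AutoTokenizer

    model_id = "VAGOsolutions/SauerkrautLM-7b-HerO"
    tokenizer = AutoTokenizer.from_pretrained(model_id)
    model = AutoModelForCausalLM.from_pretrained(model_id, device_map="auto")

    prompt = build_chatml_prompt(system, user)
    inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
    output = model.generate(**inputs, max_new_tokens=max_new_tokens)
    # Strip the prompt tokens so only the assistant's answer is returned.
    return tokenizer.decode(output[0][inputs["input_ids"].shape[1]:],
                            skip_special_tokens=True)
```

A call such as `generate_reply("Du bist ein hilfreicher Assistent.", "Was ist Sauerkraut?")` would exercise the model's German side; swapping in an English prompt works the same way.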