VAGOsolutions/SauerkrautLM-v2-14b-DPO
VAGOsolutions/SauerkrautLM-v2-14b-DPO is a 14.8-billion-parameter DPO-tuned language model developed by VAGO solutions, based on the SauerkrautLM-v2-14b-SFT model. It features a three-phase training approach that enhances English language performance while preserving German capabilities. The model is specifically optimized for detecting irrelevant function calls in German, making it suitable for applications requiring robust bilingual performance and precise function call management.
SauerkrautLM-v2-14b-DPO Overview
SauerkrautLM-v2-14b-DPO is a 14.8-billion-parameter model from VAGO solutions, an advanced DPO-tuned iteration of the SauerkrautLM-v2-14b-SFT base model. It employs a three-phase training methodology, building on the initial two SFT phases with an added DPO phase. The DPO phase trained on 80 million tokens while updating only 15% of the model's layers.
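For context, Direct Preference Optimization trains the policy to prefer "chosen" over "rejected" responses relative to a frozen reference model. A minimal sketch of the standard per-pair DPO loss follows; the β value and log-probabilities are illustrative and not taken from the SauerkrautLM training run:

```python
import math

def dpo_loss(policy_chosen_logp: float, policy_rejected_logp: float,
             ref_chosen_logp: float, ref_rejected_logp: float,
             beta: float = 0.1) -> float:
    """Per-pair DPO loss.

    Each argument is the summed log-probability of a complete response
    under the trainable policy or the frozen reference model.
    """
    # Implicit reward margin: how much more the policy prefers the chosen
    # response over the rejected one, relative to the reference model.
    margin = beta * ((policy_chosen_logp - ref_chosen_logp)
                     - (policy_rejected_logp - ref_rejected_logp))
    # -log(sigmoid(margin)): shrinks as the policy cleanly prefers "chosen".
    return -math.log(1.0 / (1.0 + math.exp(-margin)))

# With identical policy and reference log-probs the margin is 0,
# so the loss is -log(0.5) = log(2) ≈ 0.6931.
print(dpo_loss(-12.0, -15.0, -12.0, -15.0))
```

In practice this loss is computed over batches of preference pairs; here only a fraction of layers (15%, per the description above) would receive gradient updates.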
Key Capabilities & Optimizations
- Enhanced English Performance: The DPO phase focused on optimizing English language capabilities.
- German Language Preservation: Despite English optimization, the model maintains strong German language performance.
- Improved German Function Calling: Better recognizes when a German-language request is irrelevant to the available functions and no call should be made.
- Three-Phase Training: Combines two SFT phases with a DPO phase for refined performance.
- Community Datasets: Two new German DPO datasets, "SauerkrautLM-Fermented-GER-DPO" and "SauerkrautLM-Fermented-Irrelevance-GER-DPO," were used in training and will be released to the community.
Intended Use Cases
This model is particularly well-suited for applications requiring:
- Robust performance in both English and German contexts.
- Precise and context-aware function calling, especially in German.
- Scenarios where maintaining bilingual proficiency is critical.
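As a sketch of how such a model might be used, here is a hypothetical chat setup via the Hugging Face transformers library. The system prompt, generation parameters, and the assumption that the repository ships a chat template are illustrative, not taken from the model card:

```python
MODEL_ID = "VAGOsolutions/SauerkrautLM-v2-14b-DPO"

# Illustrative bilingual chat; the German system/user prompts are
# assumptions, chosen to exercise the model's German capabilities.
messages = [
    {"role": "system", "content": "Du bist ein hilfreicher Assistent."},  # "You are a helpful assistant."
    {"role": "user", "content": "Fasse die Vorteile von DPO in zwei Sätzen zusammen."},  # "Summarize the benefits of DPO in two sentences."
]

def generate(messages, max_new_tokens=256):
    # Imports kept local: loading the 14.8B checkpoint requires a download
    # from the Hugging Face Hub and substantial GPU memory.
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
    model = AutoModelForCausalLM.from_pretrained(MODEL_ID, device_map="auto")
    inputs = tokenizer.apply_chat_template(
        messages, add_generation_prompt=True, return_tensors="pt"
    ).to(model.device)
    output = model.generate(inputs, max_new_tokens=max_new_tokens)
    # Decode only the newly generated tokens, not the prompt.
    return tokenizer.decode(output[0][inputs.shape[-1]:], skip_special_tokens=True)
```

Calling `generate(messages)` would return the model's German-language reply; for function-calling scenarios, tool definitions would additionally be passed through the chat template.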