VAGOsolutions/SauerkrautLM-v2-14b-DPO
VAGOsolutions/SauerkrautLM-v2-14b-DPO is a 14.8-billion-parameter DPO-tuned language model developed by VAGO solutions, based on the SauerkrautLM-v2-14b-SFT model. It features a three-phase training approach that enhances English language performance while preserving German capabilities. The model is specifically optimized for detecting irrelevant function calls in German, making it suitable for applications requiring robust bilingual performance and precise function call management.
SauerkrautLM-v2-14b-DPO Overview
SauerkrautLM-v2-14b-DPO is a 14.8-billion-parameter model from VAGO solutions, an advanced DPO-tuned iteration of the SauerkrautLM-v2-14b-SFT base model. It employs a three-phase training methodology, building on the initial two SFT phases with an added DPO phase. The DPO phase trained on 80 million tokens while updating only 15% of the model's layers.
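For context, Direct Preference Optimization trains the policy to prefer "chosen" over "rejected" responses relative to a frozen reference model. A minimal sketch of the standard per-pair DPO loss follows; the β value and log-probabilities are illustrative and not taken from the SauerkrautLM training run:

```python
import math

def dpo_loss(policy_chosen_logp: float, policy_rejected_logp: float,
             ref_chosen_logp: float, ref_rejected_logp: float,
             beta: float = 0.1) -> float:
    """Per-pair DPO loss.

    Each argument is the summed log-probability of a complete response
    under the trainable policy or the frozen reference model.
    """
    # Implicit reward margin: how much more the policy prefers the chosen
    # response over the rejected one, relative to the reference model.
    margin = beta * ((policy_chosen_logp - ref_chosen_logp)
                     - (policy_rejected_logp - ref_rejected_logp))
    # -log(sigmoid(margin)): shrinks as the policy cleanly prefers "chosen".
    return -math.log(1.0 / (1.0 + math.exp(-margin)))

# With identical policy and reference log-probs the margin is 0,
# so the loss is -log(0.5) = log(2) ≈ 0.6931.
print(dpo_loss(-12.0, -15.0, -12.0, -15.0))
```

In practice this loss is computed over batches of preference pairs; here only a fraction of layers (15%, per the description above) would receive gradient updates.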
Key Capabilities & Optimizations
- Enhanced English Performance: The DPO phase focused on optimizing English language capabilities.
- German Language Preservation: Despite English optimization, the model maintains strong German language performance.
- Improved German Function Calling: Better recognizes when a German-language request is irrelevant to the available functions and no call should be made.
- Three-Phase Training: Combines two SFT phases with a DPO phase for refined performance.
- Community Datasets: Two new German DPO datasets, "SauerkrautLM-Fermented-GER-DPO" and "SauerkrautLM-Fermented-Irrelevance-GER-DPO," were used in training and will be released to the community.
Intended Use Cases
This model is particularly well-suited for applications requiring:
- Robust performance in both English and German contexts.
- Precise and context-aware function calling, especially in German.
- Scenarios where maintaining bilingual proficiency is critical.
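As a sketch of how such a model might be used, here is a hypothetical chat setup via the Hugging Face transformers library. The system prompt, generation parameters, and the assumption that the repository ships a chat template are illustrative, not taken from the model card:

```python
MODEL_ID = "VAGOsolutions/SauerkrautLM-v2-14b-DPO"

# Illustrative bilingual chat; the German system/user prompts are
# assumptions, chosen to exercise the model's German capabilities.
messages = [
    {"role": "system", "content": "Du bist ein hilfreicher Assistent."},  # "You are a helpful assistant."
    {"role": "user", "content": "Fasse die Vorteile von DPO in zwei Sätzen zusammen."},  # "Summarize the benefits of DPO in two sentences."
]

def generate(messages, max_new_tokens=256):
    # Imports kept local: loading the 14.8B checkpoint requires a download
    # from the Hugging Face Hub and substantial GPU memory.
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
    model = AutoModelForCausalLM.from_pretrained(MODEL_ID, device_map="auto")
    inputs = tokenizer.apply_chat_template(
        messages, add_generation_prompt=True, return_tensors="pt"
    ).to(model.device)
    output = model.generate(inputs, max_new_tokens=max_new_tokens)
    # Decode only the newly generated tokens, not the prompt.
    return tokenizer.decode(output[0][inputs.shape[-1]:], skip_special_tokens=True)
```

Calling `generate(messages)` would return the model's German-language reply; for function-calling scenarios, tool definitions would additionally be passed through the chat template.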