VAGOsolutions/SauerkrautLM-SOLAR-Instruct

Text Generation · Concurrency Cost: 1 · Model Size: 10.7B · Quant: FP8 · Ctx Length: 4K · Published: Dec 20, 2023 · License: cc-by-nc-4.0 · Architecture: Transformer · Open Weights

VAGOsolutions/SauerkrautLM-SOLAR-Instruct is a 10.7 billion parameter instruction-tuned causal language model developed by VAGO solutions, based on the Upstage SOLAR-10.7B-Instruct-v1.0 architecture. This model is specifically fine-tuned and aligned with DPO using augmented German datasets, enhancing its grammatical and syntactical correctness in German. It excels in German language tasks while maintaining strong performance across general benchmarks, making it suitable for applications requiring high-quality German language generation.
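As a sketch of how such an instruction-tuned model is typically loaded with the Hugging Face `transformers` library (the settings shown, such as `device_map` and `max_new_tokens`, are illustrative assumptions rather than values from this model card):

```python
MODEL_ID = "VAGOsolutions/SauerkrautLM-SOLAR-Instruct"


def load_and_generate(prompt: str, max_new_tokens: int = 256) -> str:
    """Load the model and return a completion for the given prompt.

    Imports are deferred because transformers/torch are heavy optional
    dependencies; device_map="auto" is an illustrative assumption.
    """
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
    model = AutoModelForCausalLM.from_pretrained(MODEL_ID, device_map="auto")

    inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
    output = model.generate(**inputs, max_new_tokens=max_new_tokens)
    # Strip the prompt tokens so only the generated continuation is returned.
    return tokenizer.decode(
        output[0][inputs["input_ids"].shape[1]:], skip_special_tokens=True
    )
```

Running `load_and_generate` downloads roughly 10.7B parameters of weights, so a GPU with sufficient memory (or a quantized variant) is advisable.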


SauerkrautLM-SOLAR-Instruct Overview

SauerkrautLM-SOLAR-Instruct is a 10.7 billion parameter instruction-tuned model developed by VAGO solutions, building upon the Upstage SOLAR-10.7B-Instruct-v1.0 base. Its primary differentiator is its enhanced German language capabilities, achieved through fine-tuning with a specialized mix of German data augmentation and translated datasets. This process, including alignment via DPO with the German SauerkrautLM-DPO dataset, addresses the common issue of unnatural German phrasings often resulting from simple translation.

Key Capabilities

  • Improved German Language Proficiency: Specifically trained to produce grammatically and syntactically correct German with natural wording.
  • DPO Alignment: Utilizes Direct Preference Optimization with a unique German DPO dataset for refined instruction following.
  • Multilingual Support: Supports both English and German, with a focus on German quality.
  • Contamination-Free Training: Rigorous data contamination tests confirm the integrity of its training datasets, particularly for ARC, MMLU, TruthfulQA, and GSM8K.

Good For

  • Applications requiring high-quality German text generation and understanding.
  • Use cases where a robust, instruction-following model with strong German linguistic accuracy is crucial.
  • Developers seeking a 10.7B parameter model that balances general performance with specialized German language optimization.

Popular Sampler Settings

The parameter combinations most commonly used by Featherless users for this model cover the following sampler settings:

  • temperature
  • top_p
  • top_k
  • frequency_penalty
  • presence_penalty
  • repetition_penalty
  • min_p
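These sampler settings are typically passed in the request body of an OpenAI-compatible completions endpoint. The sketch below builds such a payload; the default values are illustrative assumptions, not the actual Featherless user statistics, and `repetition_penalty`/`min_p` are extensions supported by some inference servers rather than part of the core OpenAI schema.

```python
def sampler_payload(prompt: str,
                    temperature: float = 0.7,
                    top_p: float = 0.9,
                    repetition_penalty: float = 1.1,
                    min_p: float = 0.05) -> dict:
    """Build a request body for an OpenAI-compatible completions endpoint.

    Default values are illustrative assumptions; repetition_penalty and
    min_p are server-specific extensions to the OpenAI request schema.
    """
    return {
        "model": "VAGOsolutions/SauerkrautLM-SOLAR-Instruct",
        "prompt": prompt,
        "temperature": temperature,
        "top_p": top_p,
        "repetition_penalty": repetition_penalty,
        "min_p": min_p,
    }
```

The resulting dict can be sent as the JSON body of a POST to the provider's completions route using any HTTP client.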