VAGOsolutions/SauerkrautLM-gemma-2-9b-it

Text Generation · Concurrency Cost: 1 · Model Size: 9B · Quant: FP8 · Context Length: 16k · Published: Aug 12, 2024 · License: Gemma · Architecture: Transformer

SauerkrautLM-gemma-2-9b-it is a 9 billion parameter instruction-tuned model developed by VAGO solutions, fine-tuned from Google's Gemma-2-9b-it. It applies Spectrum Fine-Tuning to 25% of the model's layers using the German-English "Sauerkraut Mix v2" dataset. The result is a resource-efficient improvement in instruction-following, common-sense reasoning, and math capabilities, making it well suited to bilingual German-English applications.

SauerkrautLM-gemma-2-9b-it: Resource-Efficient Bilingual Fine-Tuning

SauerkrautLM-gemma-2-9b-it is a 9 billion parameter instruction-tuned model from VAGO solutions, built upon Google's Gemma-2-9b-it. This model showcases the effectiveness of Spectrum Fine-Tuning for enhancing large language models with reduced computational resources.
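
Because it is a drop-in replacement for the base Gemma-2-9b-it checkpoint, the model can be run with the standard transformers chat-template workflow. The sketch below is illustrative only: the model ID matches the published repository name, but the dtype, device placement, and generation settings are assumptions to adapt to your hardware.

```python
# Minimal sketch (assumptions: bf16 weights fit your GPU; generation settings are illustrative).
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "VAGOsolutions/SauerkrautLM-gemma-2-9b-it"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,  # assumption: adjust to your hardware
    device_map="auto",
)

# A German prompt to exercise the bilingual tuning; Gemma-2 chat templates use user/assistant roles only.
messages = [{"role": "user", "content": "Erkläre kurz, was Spectrum Fine-Tuning ist."}]
input_ids = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

output = model.generate(input_ids, max_new_tokens=256, do_sample=False)
print(tokenizer.decode(output[0][input_ids.shape[-1]:], skip_special_tokens=True))
```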

Key Capabilities & Training:

  • Bilingual Proficiency: Fine-tuned on a proprietary "German-English Sauerkraut Mix v2" dataset, focusing on high-quality German and English data.
  • Resource-Efficient Enhancement: Utilizes Spectrum Fine-Tuning, targeting only 25% of the model's layers, to improve capabilities at a fraction of the cost of full fine-tuning (a conceptual sketch follows this list).
  • Improved Performance: Shows clear gains in instruction-following, common-sense reasoning, and mathematical tasks in the developers' internal evaluations (AGIEval, GPT4All, TruthfulQA, Open LLM Leaderboard 2, MMLU 5-shot).
  • Preserves Knowledge: The fine-tuning approach is designed to enhance specific skills while largely preserving the model's existing knowledge base.
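
Spectrum's core idea is to rank transformer layers by an information measure (a signal-to-noise-ratio analysis in the original method) and train only the most informative fraction while freezing the rest. The sketch below shows only the mechanical part, freezing all parameters except a chosen ~25% of layers; the index-based selection rule is a placeholder, not Spectrum's actual SNR ranking.

```python
# Conceptual sketch only: freeze everything, then unfreeze ~25% of the
# transformer layers. The index-based selection is a placeholder; Spectrum
# itself chooses layers via a signal-to-noise-ratio analysis of the weights.
from transformers import AutoModelForCausalLM

model = AutoModelForCausalLM.from_pretrained("google/gemma-2-9b-it")

num_layers = model.config.num_hidden_layers
# Placeholder: pick the top quarter of layer indices.
selected = {str(i) for i in range(int(num_layers * 0.75), num_layers)}

for name, param in model.named_parameters():
    # Parameter names look like "model.layers.31.mlp.down_proj.weight".
    parts = name.split(".")
    in_selected = "layers" in parts and parts[parts.index("layers") + 1] in selected
    param.requires_grad = in_selected  # train selected layers, freeze the rest

trainable = sum(p.numel() for p in model.parameters() if p.requires_grad)
total = sum(p.numel() for p in model.parameters())
print(f"Trainable: {trainable:,} of {total:,} parameters ({trainable / total:.1%})")
```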

Ideal Use Cases:

  • Applications requiring strong performance in both German and English.
  • Scenarios where resource-efficient model enhancement is critical.
  • Tasks benefiting from improved instruction-following and reasoning abilities.