SauerkrautLM-1.5b Overview
SauerkrautLM-1.5b is a 1.5-billion-parameter language model developed by VAGO solutions, built on the Qwen/Qwen2-1.5B architecture. Its primary differentiator is the use of Spectrum Continuous Pre-Training (CPT) on German data, targeting only 25% of the model's layers. This approach substantially reduces training resource consumption while still delivering a marked improvement in German language proficiency.
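The layer-targeted training behind Spectrum CPT can be illustrated with a short PyTorch/transformers sketch: freeze the whole base model, then re-enable gradients for roughly a quarter of the transformer blocks before continued pre-training on German text. The layer indices below are illustrative placeholders; the actual Spectrum method selects layers from a signal-to-noise analysis rather than by fixed position.

```python
import torch
from transformers import AutoModelForCausalLM

# Load the base model that SauerkrautLM-1.5b starts from.
model = AutoModelForCausalLM.from_pretrained("Qwen/Qwen2-1.5B", torch_dtype=torch.bfloat16)

# Freeze every parameter first.
for param in model.parameters():
    param.requires_grad = False

# Re-enable training for ~25% of the transformer blocks (7 of Qwen2-1.5B's 28 layers).
# These indices are hypothetical; Spectrum picks layers via an SNR scan, not by position.
target_layers = {3, 7, 11, 15, 19, 23, 27}
for idx, block in enumerate(model.model.layers):
    if idx in target_layers:
        for param in block.parameters():
            param.requires_grad = True

trainable = sum(p.numel() for p in model.parameters() if p.requires_grad)
total = sum(p.numel() for p in model.parameters())
print(f"Updating {trainable / total:.1%} of all parameters during CPT")
```

Only the unfrozen blocks receive gradient updates during continued pre-training, which is what keeps the GPU cost far below a full-model CPT run.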
Key Capabilities & Training Insights
- Resource-Efficient Multilingualism: Achieves substantial improvements in German language skills with a fraction of the resources typically required for full CPT; the CPT stage, run over 6.1 billion German tokens, cost $1152 in rented GPU time.
- Performance: In German RAG evaluations it performs comparably to 8-billion-parameter models, and it matches or surpasses the base Qwen2-1.5B-Instruct model on some English benchmarks.
- Mobile Deployment: Its compact 1.5 billion parameter size makes it well-suited for deployment on smartphones and tablets.
- Training Process: After CPT, the model underwent three epochs of Supervised Fine-Tuning (SFT) on 700K samples and was then aligned with Direct Preference Optimization (DPO) on 70K samples; a sketch of these two stages follows this list.
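A hedged sketch of those two post-CPT stages using the TRL library is shown below; the dataset identifiers and hyperparameters are placeholders, not the data or settings actually used for SauerkrautLM-1.5b.

```python
from datasets import load_dataset
from transformers import AutoModelForCausalLM, AutoTokenizer
from trl import DPOConfig, DPOTrainer, SFTConfig, SFTTrainer

base = "Qwen/Qwen2-1.5B"  # in practice this would be the German-CPT checkpoint
tokenizer = AutoTokenizer.from_pretrained(base)
model = AutoModelForCausalLM.from_pretrained(base)

# Stage 1: supervised fine-tuning (the overview reports 3 epochs over 700K samples).
# The dataset is hypothetical and must provide text in a format SFTTrainer accepts.
sft_data = load_dataset("your-org/german-sft-samples", split="train")
sft_trainer = SFTTrainer(
    model=model,
    train_dataset=sft_data,
    args=SFTConfig(output_dir="sauerkraut-sft", num_train_epochs=3),
)
sft_trainer.train()

# Stage 2: Direct Preference Optimization (the overview reports 70K samples).
# The dataset is hypothetical and needs "prompt", "chosen", and "rejected" columns.
dpo_data = load_dataset("your-org/german-dpo-pairs", split="train")
dpo_trainer = DPOTrainer(
    model=sft_trainer.model,
    args=DPOConfig(output_dir="sauerkraut-dpo", beta=0.1),
    train_dataset=dpo_data,
    processing_class=tokenizer,  # named `tokenizer=` in older TRL releases
)
dpo_trainer.train()
```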
Why Use SauerkrautLM-1.5b?
- German Language Applications: Ideal for use cases requiring strong German language understanding and generation, especially where resource efficiency is critical.
- Edge Device Deployment: Its small size and optimized performance make it a strong candidate for on-device AI applications.
- Demonstration of Efficient Training: Serves as a practical example of how targeted CPT techniques can efficiently adapt an LLM to a new language without significantly degrading performance in its original language.
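For orientation, a minimal inference sketch is included below. The Hugging Face repository ID and the availability of a Qwen2-style chat template are assumptions; consult the official model card for the canonical usage example.

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "VAGOsolutions/SauerkrautLM-1.5b"  # assumed repository ID
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id, torch_dtype=torch.bfloat16, device_map="auto"
)

# German prompt, formatted with the tokenizer's chat template (assumed to be present).
messages = [{"role": "user", "content": "Erkläre kurz, was Continuous Pre-Training ist."}]
inputs = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

outputs = model.generate(inputs, max_new_tokens=200)
print(tokenizer.decode(outputs[0][inputs.shape[-1]:], skip_special_tokens=True))
```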