Overview
BübleLM is a 2.6 billion parameter German language model built upon the Gemma-2-2B architecture. Developed by flair, the model's key innovation is its adaptation to German via trans-tokenization with a custom German SentencePiece tokenizer, which significantly improves performance on German language tasks. The model was trained on 3.5 billion tokens from the Occiglot-FineWeb project, encompassing diverse German data sources such as web content, legislative documents, news, and Wikipedia.
Key Capabilities & Performance
- German Language Optimization: Achieves substantial improvements over the base Gemma-2-2B model on German benchmarks, including a +71% increase on HellaSwag-DE and +41% on ARC-DE.
- Custom Tokenization: Employs a 20k vocabulary German SentencePiece tokenizer, optimized for German morphological structures, leading to better token efficiency.
- Context Length: Supports an 8192-token context window.
- Outperforms Alternatives: Surpasses both the base Gemma-2-2B and other German models such as LLäMmlein-1B on most evaluated tasks.
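Token efficiency is commonly measured as fertility: the average number of subword tokens produced per word, where lower is better. A minimal sketch of such a comparison is below; the two toy tokenizers are illustrative stand-ins only, not the actual BübleLM SentencePiece or Gemma-2 vocabularies, which a real measurement would load instead.

```python
def fertility(tokenize, text):
    """Average number of subword tokens produced per whitespace-separated word."""
    words = text.split()
    tokens = tokenize(text)
    return len(tokens) / len(words)

# Illustrative stand-ins: a real comparison would call BübleLM's German
# SentencePiece tokenizer and the original Gemma-2 tokenizer instead.
def char_bigram_tokenize(text):
    # Crude tokenizer with poor German coverage: splits each word into 2-char pieces.
    pieces = []
    for word in text.split():
        pieces.extend(word[i:i + 2] for i in range(0, len(word), 2))
    return pieces

def word_tokenize(text):
    # Idealized German-optimized tokenizer: keeps whole words as single tokens.
    return text.split()

# Long German compounds are where a tailored vocabulary pays off most.
sample = "Bundesdatenschutzgesetz und Grundgesetz"
print(fertility(char_bigram_tokenize, sample))  # ~6.7 tokens per word
print(fertility(word_tokenize, sample))         # 1.0 token per word
```

A lower fertility means fewer tokens per sentence, so more German text fits into the 8192-token context window and each forward pass covers more content.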
Usage & Limitations
BübleLM is a base language model, not an instruction-tuned one, so it is best suited to text completion rather than chat or instruction following without further fine-tuning. Its limitations include a relatively small vocabulary (20k tokens) compared to multilingual models, and potentially variable performance in highly specialized domains that are under-represented in its training data.
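For text completion, the model can be used with the standard Hugging Face transformers causal-LM API. The sketch below assumes a Hub repository id of "flair/bueble-lm-2b"; verify the actual id on the model card before use.

```python
def generate_completion(prompt: str,
                        model_name: str = "flair/bueble-lm-2b",  # assumed repo id
                        max_new_tokens: int = 64) -> str:
    """Complete a German prompt with a base (non-instruction-tuned) causal LM."""
    # Lazy import so the sketch can be read without the heavy dependencies installed.
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(model_name)
    model = AutoModelForCausalLM.from_pretrained(model_name)

    # Plain text in, plain text out: no chat template, since this is a base model.
    inputs = tokenizer(prompt, return_tensors="pt")
    outputs = model.generate(**inputs, max_new_tokens=max_new_tokens)
    return tokenizer.decode(outputs[0], skip_special_tokens=True)
```

Because the model is not instruction-tuned, prompts should be phrased as text to be continued (e.g. the opening sentence of an article) rather than as questions or commands.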