flair/bueble-lm-2b
- Task: Text generation
- Concurrency cost: 1
- Model size: 2.6B parameters
- Quantization: BF16
- Context length: 8k
- Published: Sep 15, 2024
- License: apache-2.0
- Architecture: Transformer (open weights)

BübleLM is a 2.6 billion parameter German language model developed by flair, based on the Gemma-2-2B architecture. It uses a custom German SentencePiece tokenizer and was trained on 3.5 billion tokens of German web content, legislative documents, and news. Compared to its base model, it significantly improves performance on German benchmarks such as HellaSwag-DE and ARC-DE, making it well suited for German language understanding and generation tasks.
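As a sketch of how the model might be used, the snippet below loads it for German text generation with the Hugging Face `transformers` library. This is an assumed usage pattern, not taken from the model card itself; it presumes `transformers` and `torch` are installed and that the weights load via the standard `AutoModelForCausalLM` interface.

```python
# Hedged sketch: loading BübleLM for German text generation.
# Assumes the model is compatible with transformers' AutoModelForCausalLM
# (its Gemma-2 base architecture is supported there).
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

MODEL_ID = "flair/bueble-lm-2b"


def generate_german(prompt: str, max_new_tokens: int = 64) -> str:
    """Generate a German continuation for the given prompt."""
    tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
    model = AutoModelForCausalLM.from_pretrained(
        MODEL_ID,
        torch_dtype=torch.bfloat16,  # matches the published BF16 weights
        device_map="auto",
    )
    inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
    output = model.generate(**inputs, max_new_tokens=max_new_tokens)
    return tokenizer.decode(output[0], skip_special_tokens=True)


# Example call (downloads the weights on first use):
# generate_german("Die Hauptstadt von Deutschland ist")
```

Loading in `bfloat16` keeps memory use close to the published BF16 checkpoint size; `device_map="auto"` places the model on a GPU when one is available.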
