LSX-UniWue/LLaMmlein_1B

Text generation · Model size: 1.1B · Quant: BF16 · Context length: 2k · Published: Jul 4, 2025 · License: other · Architecture: Transformer

LLäMmlein 1B is a 1.1-billion-parameter German LLaMa model developed by LSX-UniWue, trained from scratch on a deduplicated and filtered German portion of the RedPajama V2 dataset. It is designed specifically for German-language tasks, offering a specialized alternative to general-purpose LLMs. Its primary use case is research and development in German natural language processing, where its focused training improves performance in that domain.


LLäMmlein 1B Overview

LLäMmlein 1B is a 1.1 billion parameter German LLaMa model developed by LSX-UniWue. It was trained from scratch using an adapted TinyLlama codebase on a carefully curated German subset of the RedPajama V2 dataset. The dataset underwent rigorous deduplication at the paragraph level and filtering using a token-to-word ratio to enhance data quality.

Key Capabilities

  • German Language Specialization: Optimized for tasks requiring deep understanding and generation of German text.
  • Research-Oriented: Provides access to intermediate checkpoints throughout the training process, including associated data points, which is valuable for research into model development and learning.
  • Efficient Deployment: Compatible with the transformers library, with optional flash-attn support for improved efficiency.
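To illustrate deployment with the transformers library, here is a minimal sketch. The model id matches this page; the prompt, generation settings, and the flash-attn toggle are illustrative assumptions, not recommendations from the model authors.

```python
# Minimal sketch: running LLäMmlein 1B with the transformers library.
# Generation settings below are illustrative defaults, not tuned values.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

MODEL_ID = "LSX-UniWue/LLaMmlein_1B"

def generate(prompt: str, max_new_tokens: int = 50) -> str:
    """Load the model and continue a German prompt."""
    tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
    model = AutoModelForCausalLM.from_pretrained(
        MODEL_ID,
        torch_dtype=torch.bfloat16,  # matches the published BF16 weights
        # attn_implementation="flash_attention_2",  # optional; needs flash-attn installed
    )
    inputs = tokenizer(prompt, return_tensors="pt")
    outputs = model.generate(**inputs, max_new_tokens=max_new_tokens)
    return tokenizer.decode(outputs[0], skip_special_tokens=True)

if __name__ == "__main__":
    # Example German prompt (hypothetical, for illustration only).
    print(generate("Die Würzburger Residenz ist"))
```

The heavy model load is kept inside a function and behind a `__main__` guard so the module can be imported without downloading weights.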

Good for

  • German NLP Research: Ideal for academic and research projects focusing on German language models and their training dynamics.
  • Comparative Studies: Useful for comparing performance against other multilingual or German-specific LLMs.
  • Exploring Training Progression: Researchers can analyze model behavior at different training stages using the provided intermediate checkpoints and logged data.
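Since the intermediate checkpoints are published alongside the final model, training progression can be explored by loading specific snapshots from the hub. A sketch, assuming checkpoints are exposed as hub revisions; the revision name below is a placeholder, and the actual checkpoint names are listed on the model's Hugging Face page.

```python
# Sketch: loading an intermediate training checkpoint via the hub's
# `revision` parameter. The revision string is a placeholder, not a
# real checkpoint name — consult the model page for the actual list.
from transformers import AutoModelForCausalLM

MODEL_ID = "LSX-UniWue/LLaMmlein_1B"

def load_checkpoint(revision: str):
    """Fetch a specific training-stage snapshot of the model."""
    return AutoModelForCausalLM.from_pretrained(MODEL_ID, revision=revision)

if __name__ == "__main__":
    # Placeholder revision; replace with a checkpoint name from the hub.
    model = load_checkpoint("<intermediate-checkpoint-revision>")
```

Loading two different revisions this way lets you compare, for example, perplexity or probe accuracy across training stages.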