CYFRAGOVPL/PLLuM-12B-base

Task: Text Generation
Concurrency Cost: 1
Model Size: 12B
Quant: FP8
Ctx Length: 32k
Published: Feb 7, 2025
License: apache-2.0
Architecture: Transformer
Open Weights

The CYFRAGOVPL/PLLuM-12B-base is a 12 billion parameter base model from the PLLuM family, developed by a consortium of Polish scientific institutions. Built upon Mistral-Nemo-Base-2407, it is specialized in Polish and other Slavic/Baltic languages, with additional English data for broader generalization. This model is designed to generate contextually coherent text and serve as a foundation for specialized applications, particularly excelling in tasks relevant to Polish public administration.


PLLuM-12B-base: A Polish-Centric Language Model

CYFRAGOVPL/PLLuM-12B-base is a 12 billion parameter base model in the PLLuM family of large language models. Developed by a consortium of leading Polish scientific institutions, it is built on Mistral-Nemo-Base-2407 and optimized for Polish and other Slavic/Baltic languages, with English data included to improve generalization.
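
As a rough illustration of using a base model like this one, the sketch below loads it with the Hugging Face transformers library and produces a greedy continuation of a prompt. It assumes that the weights are published under the repository name shown on this page, that torch, transformers, and accelerate are installed, and that the hardware supports bfloat16; the function name `complete` and the 50-token default are illustrative choices, not part of the model's documentation.

```python
MODEL_ID = "CYFRAGOVPL/PLLuM-12B-base"  # repository name as given on this page


def complete(prompt: str, max_new_tokens: int = 50) -> str:
    """Return the prompt followed by a greedy continuation from the base model."""
    # Imported lazily so that defining this helper does not require the
    # heavy dependencies until it is actually called.
    import torch
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
    model = AutoModelForCausalLM.from_pretrained(
        MODEL_ID,
        torch_dtype=torch.bfloat16,  # assumption: bfloat16-capable hardware
        device_map="auto",           # assumption: the accelerate package is installed
    )

    # token_type_ids are not used by causal LM generation, so do not request them.
    inputs = tokenizer(
        prompt, return_tensors="pt", return_token_type_ids=False
    ).to(model.device)

    with torch.no_grad():
        output_ids = model.generate(
            **inputs,
            max_new_tokens=max_new_tokens,
            do_sample=False,  # greedy decoding for repeatable output
        )
    return tokenizer.decode(output_ids[0], skip_special_tokens=True)
```

Calling `complete("Najważniejszym celem sztucznej inteligencji jest")` would download the weights on first use and return the prompt with the model's continuation appended. Because this is a base model, it continues text rather than following instructions.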

Key Capabilities

  • Specialized Language Focus: Pre-trained on up to 150 billion tokens of Polish text, alongside Slavic, Baltic, and English data, which underpins its proficiency in these languages.
  • High-Quality Training Data: Draws on a large-scale, high-quality text corpus. The PLLuM family also includes a unique collection of ~40k manually created "organic instructions" in Polish, used for fine-tuning.
  • Robust Alignment: The PLLuM family includes the first Polish-language preference corpus, which teaches models factual and linguistic correctness, balance, and safety, particularly on sensitive topics.
  • Strong Performance: Achieves top scores on custom benchmarks for Polish public administration tasks and state-of-the-art results in broader Polish-language evaluations.

Good For

  • General Language Tasks: Ideal for text generation, summarization, and question answering in Polish and related languages.
  • Domain-Specific Applications: Particularly effective for developing intelligent assistants and applications in Polish public administration, legal, and bureaucratic sectors.
  • Research and Development: Serves as a robust foundation for building downstream AI applications requiring strong command of the Polish language.
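
Since this is a base model, tasks such as question answering are usually posed as text to be continued rather than as instructions. The sketch below shows one way to assemble a few-shot prompt and to trim the model's continuation back to a single answer. The helper names, the "Pytanie"/"Odpowiedź" labels, and the blank-line separator are assumptions made for illustration; they are not a format prescribed by the model's authors.

```python
def build_few_shot_prompt(examples, query,
                          input_label="Pytanie", output_label="Odpowiedź"):
    """Format (question, answer) pairs plus an open query as one prompt string."""
    blocks = [f"{input_label}: {q}\n{output_label}: {a}" for q, a in examples]
    # The final block ends with an open answer label for the model to continue.
    blocks.append(f"{input_label}: {query}\n{output_label}:")
    return "\n\n".join(blocks)


def extract_answer(generated_text, prompt):
    """Keep only the first answer that follows the prompt in the generated text."""
    continuation = generated_text[len(prompt):]
    # A base model tends to keep writing further Q/A pairs; stop at the first
    # blank line, which is the separator used when building the prompt.
    return continuation.split("\n\n")[0].strip()


examples = [
    ("Jakie miasto jest stolicą Polski?", "Warszawa"),
    ("Jakie miasto jest stolicą Czech?", "Praga"),
]
prompt = build_few_shot_prompt(examples, "Jakie miasto jest stolicą Litwy?")
print(prompt)
```

The resulting prompt can be passed to any text-generation call for the model, and `extract_answer` applied to the returned text. Two or three examples are typically enough to establish the pattern, though the right number depends on the task.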