CYFRAGOVPL/PLLuM-12B-instruct

Status: Warm
Visibility: Public
Parameters: 12B
Precision: FP8
Context length: 32768 tokens
Date: Feb 7, 2025
License: apache-2.0
Source: Hugging Face
Overview

What is PLLuM-12B-instruct?

CYFRAGOVPL/PLLuM-12B-instruct is a 12-billion-parameter instruction-tuned model from the PLLuM (Polish Large Language Model) family, developed by a consortium of Polish scientific institutions. It is built on the Mistral-Nemo-Base-2407 architecture and is optimized for Polish and other Slavic and Baltic languages, while also incorporating English data for broader generalization. The model supports a context length of 32,768 tokens.
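
Because it is published as a standard Hugging Face checkpoint, the model can be loaded with the transformers library. A minimal loading sketch, assuming transformers and PyTorch are installed; the bfloat16 dtype and automatic device placement are illustrative choices, not requirements from the model card:

    # Minimal loading sketch. bfloat16 is illustrative; FP8 serving
    # requires a compatible inference runtime.
    import torch
    from transformers import AutoModelForCausalLM, AutoTokenizer

    model_id = "CYFRAGOVPL/PLLuM-12B-instruct"
    tokenizer = AutoTokenizer.from_pretrained(model_id)
    model = AutoModelForCausalLM.from_pretrained(
        model_id,
        torch_dtype=torch.bfloat16,
        device_map="auto",  # places layers across available GPUs/CPU
    )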

Key Capabilities & Differentiators

  • Polish Language Specialization: Trained on extensive, high-quality Polish text corpora (up to 150B tokens for some models, 28B for fully open-source variants) and refined with a large, manually curated "organic instructions" dataset (~40k prompt-response pairs) in Polish.
  • Advanced Alignment: Aligned with the first Polish-language preference corpus, which teaches the model correctness, balance, and safety, especially on sensitive topics.
  • Strong Performance: Achieves state-of-the-art results across a broad range of Polish-language tasks and top scores on custom benchmarks relevant to Polish public administration.
  • Instruction-Tuned: Refined using a combination of manually curated, premium, and synthetic Polish instructions for enhanced task performance.

Should I use this for my use case?

  • Good for: General Polish-language tasks (text generation, summarization, question answering), domain-specific assistants for Polish public administration, legal, or bureaucratic topics, and research and development requiring strong command of Polish; see the generation sketch after this list. Its Apache 2.0 license permits commercial use.
  • Considerations: Like all LLMs, it may exhibit hallucinations or biases. While extensive alignment has been performed, human oversight is advised for sensitive applications.
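
For instruction-style use, a brief generation sketch that continues the loading example above. It assumes the tokenizer ships a chat template; the Polish prompt and the decoding settings are illustrative placeholders:

    # Generation sketch, continuing the loading example above. The prompt
    # translates to: "Summarize in three sentences the rules for applying
    # for an ID card."
    messages = [
        {"role": "user",
         "content": "Streść w trzech zdaniach zasady składania wniosku o dowód osobisty."},
    ]
    inputs = tokenizer.apply_chat_template(
        messages, add_generation_prompt=True, return_tensors="pt"
    ).to(model.device)
    outputs = model.generate(inputs, max_new_tokens=256, do_sample=False)
    # Decode only the newly generated tokens, skipping the prompt.
    print(tokenizer.decode(outputs[0][inputs.shape[-1]:], skip_special_tokens=True))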