Overview
PLLuM-12B-chat: Polish-Specialized LLM
CYFRAGOVPL/PLLuM-12B-chat is a 12-billion-parameter large language model from the PLLuM family, developed by CYFRAGOVPL. Built on the Mistral-Nemo-Base-2407 architecture, it is optimized for Polish and other Slavic and Baltic languages, with English data included for broader generalization. The model was pretrained on up to 150 billion tokens of Polish text (150 billion for the CC-BY-NC-4.0-licensed models, 30 billion for the Apache-2.0-licensed ones) and then instruction-tuned on a unique dataset of ~40k manually created Polish "organic instructions", supplemented with synthetic and premium Polish corpora.
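A minimal loading sketch using the Hugging Face transformers library; the model ID comes from the description above, while the dtype and device placement are illustrative assumptions, not official requirements:

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

# Model ID as published on the Hugging Face Hub (from the description above).
MODEL_ID = "CYFRAGOVPL/PLLuM-12B-chat"

tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)

# bfloat16 and device_map="auto" are illustrative choices for a 12B model;
# adjust to your hardware (bf16 weights alone need roughly 24 GB of memory).
model = AutoModelForCausalLM.from_pretrained(
    MODEL_ID,
    torch_dtype=torch.bfloat16,
    device_map="auto",
)
```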
Key Capabilities
- Polish Language Mastery: Achieves state-of-the-art results across Polish-language tasks, including text generation, summarization, and question answering.
- Domain-Specific Excellence: Demonstrates top-tier performance in specialized tasks relevant to Polish public administration, making it suitable for bureaucratic and legal topics.
- Robust Alignment: Aligned on a manually curated Polish-language preference corpus, improving safety, balance, and contextual appropriateness, even for sensitive or adversarial prompts.
- Chat-Optimized: The "-chat" variant is specifically aligned on human preferences for safer and more effective use in dialogue and general-purpose conversational scenarios (see the usage sketch after this list).
- Retrieval Augmented Generation (RAG) Support: Designed to perform well in RAG settings, providing structured responses with document citations when relevant information is available.
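A minimal generation sketch covering the chat and RAG points above, assuming the `model` and `tokenizer` loaded earlier and that the repository ships a chat template. The system prompt, document text, and generation parameters are illustrative placeholders, not the official PLLuM prompt format; in a real RAG pipeline the document would come from a retriever:

```python
# A retrieved document passed in-context; the text here is a made-up placeholder.
document = "Fragment dokumentu: wniosek o dowód osobisty składa się w urzędzie gminy."

messages = [
    # System and user turns are illustrative; consult the model card for the
    # recommended prompt format.
    {"role": "system", "content": "Odpowiadaj po polsku, cytując podany dokument."},
    {"role": "user", "content": (
        f"Dokument:\n{document}\n\n"
        "Pytanie: Gdzie składa się wniosek o dowód osobisty?"
    )},
]

# apply_chat_template formats the conversation with the template bundled in the
# tokenizer (assuming the repo provides one) and appends the assistant prefix.
inputs = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

outputs = model.generate(inputs, max_new_tokens=256, do_sample=False)

# Decode only the newly generated tokens, skipping the prompt.
print(tokenizer.decode(outputs[0][inputs.shape[-1]:], skip_special_tokens=True))
```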
Good for
- Applications requiring high-quality, contextually aware text generation in Polish.
- Developing intelligent assistants for Polish public administration or legal sectors.
- Research and development of AI applications where strong command of the Polish language is crucial.
- General language tasks such as summarization and question answering in Polish.