CYFRAGOVPL/PLLuM-4B-base-2512
PLLuM-4B-base-2512 is a 4.3-billion-parameter large language model developed by the PLLuM consortium and continued by HIVE AI. It specializes in Polish and includes additional English data. Based on Google's Gemma-3-4b-pt, it is designed to generate contextually coherent text and to assist with tasks such as question answering and summarization, with a particular focus on Polish public administration. The model was trained on high-quality Polish and English corpora, refined through extensive instruction tuning, and aligned using a manually annotated Polish preference corpus.
PLLuM-4B-base-2512: Polish Language Model
PLLuM-4B-base-2512 is a 4.3-billion-parameter model from the PLLuM family, developed by the PLLuM consortium and later continued by HIVE AI. It is built on Google's Gemma-3-4b-pt and targets the Polish language, with additional English data included for broader generalization. Its training data consists of large-scale, high-quality Polish and English text corpora that were cleaned and deduplicated.
Key Capabilities
- Polish Language Specialization: Optimized for understanding and generating contextually coherent text in Polish.
- Extensive Instruction Tuning: Fine-tuned on roughly 163k instructions in total: approximately 70k manually curated Polish "organic instructions," 33k programmatically derived instructions, 15k RAG-style context-processing instructions, and 45k synthetic, context-aware instructions.
- Preference Learning: Aligned using ~60k manually annotated Polish preference pairs to ensure safer, balanced, and contextually appropriate responses, even for sensitive topics.
- Strong Performance: Reported to achieve top scores on custom benchmarks for Polish public administration and state-of-the-art results on broader Polish-language tasks.
- RAG Optimization: Trained to perform well in Retrieval-Augmented Generation (RAG) settings, producing answers grounded in the supplied documents, with citations.
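To make the RAG capability above more concrete, here is a minimal sketch of how retrieved passages might be packed into a single prompt so that an answer can cite them by number. The prompt layout, the `Document` class, and the `build_rag_prompt` helper are all hypothetical illustrations and not the official PLLuM prompt template. The two Polish passages are made-up sample inputs.

```python
from dataclasses import dataclass


@dataclass
class Document:
    """A retrieved passage; `doc_id` is the number an answer should cite."""
    doc_id: int
    text: str


def build_rag_prompt(question: str, documents: list[Document]) -> str:
    """Assemble a citation-friendly prompt.

    Hypothetical layout chosen for illustration only; it is NOT the
    official PLLuM RAG template.
    """
    if not documents:
        raise ValueError("at least one document is required")
    # Number each passage so the answer can refer to it as [1], [2], ...
    context = "\n".join(f"[{d.doc_id}] {d.text.strip()}" for d in documents)
    return (
        "Answer the question using only the documents below. "
        "Cite the supporting document number in square brackets.\n\n"
        f"Documents:\n{context}\n\n"
        f"Question: {question.strip()}\n"
        "Answer:"
    )


# Made-up sample passages, used only to show the resulting prompt shape.
docs = [
    Document(1, "Dowód osobisty jest ważny przez 10 lat od daty wydania."),
    Document(2, "Wniosek o dowód osobisty można złożyć w dowolnym urzędzie gminy."),
]
prompt = build_rag_prompt("Jak długo ważny jest dowód osobisty?", docs)
print(prompt)
```

The resulting string would then be passed to whatever text-generation call you use. Ending the prompt with `Answer:` suits a completion-style model, which simply continues the text from that point.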
Good for
- General Language Tasks: Text generation, summarization, extraction, and question answering in Polish.
- Domain-Specific Assistants: Particularly effective for applications within Polish public administration, legal, or bureaucratic domains.
- Research & Development: Serves as a foundational model for downstream AI applications requiring strong Polish language capabilities.