PLLuM: A Family of Polish Large Language Models
CYFRAGOVPL/Llama-PLLuM-70B-chat-250801 is a 70 billion parameter large language model developed by the HIVE AI Consortium, building upon the Llama 3.1 architecture. This model is specifically designed and optimized for Polish and other Slavic/Baltic languages, while also incorporating English data for broader generalization. It features a 32768 token context length and has undergone rigorous instruction fine-tuning and preference learning using unique, high-quality Polish datasets.
Key Capabilities
- Specialized Multilingualism: Strong command of Polish, Slavic, and Baltic languages, with robust English generalization.
- High-Quality Training Data: Pretrained on approximately 150 billion tokens of Polish corpora, including 28 billion tokens available for open-source commercial use.
- Advanced Fine-Tuning: Utilizes ~55k manually curated "organic instructions" and a custom Polish preference corpus for enhanced safety, balance, and contextual appropriateness.
- Domain-Specific Excellence: Achieves top scores on custom benchmarks relevant to Polish public administration, demonstrating state-of-the-art performance in Polish-language tasks.
- Retrieval Augmented Generation (RAG): Additionally trained to perform well in RAG settings, providing document-cited answers.
Good for
- General Polish Language Tasks: Text generation, summarization, and question answering in Polish.
- Polish Public Administration: Developing domain-specific intelligent assistants and applications for government services.
- Research & Development: Serving as a foundational model for AI applications requiring strong Polish language capabilities.
- Dialogue Systems: The
-chat variant is aligned to human preferences for safer and more efficient use in conversational scenarios.