CYFRAGOVPL/Llama-PLLuM-70B-instruct
CYFRAGOVPL/Llama-PLLuM-70B-instruct is a 70-billion-parameter instruction-tuned large language model from the PLLuM family, developed by a consortium of Polish institutions including Politechnika Wrocławska. Built on the Llama 3.1 architecture, it specializes in Polish and other Slavic and Baltic languages, with additional English data for generalization. It is well suited to tasks such as question answering and summarization, particularly for Polish public administration and general Polish-language applications.
PLLuM: Polish Large Language Models
CYFRAGOVPL/Llama-PLLuM-70B-instruct is a 70-billion-parameter instruction-tuned model from the PLLuM family, developed by a consortium of Polish scientific institutions led by Politechnika Wrocławska. The model is built on the Llama 3.1 architecture and optimized specifically for Polish and other Slavic and Baltic languages, with English data included for broader generalization. It supports a 32,768-token context length.
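As a Llama 3.1-based instruct checkpoint, the model can be loaded with Hugging Face `transformers` using the standard chat workflow. A minimal sketch, assuming the usual `apply_chat_template` flow; the hardware settings (`device_map="auto"`, bfloat16) and generation length are illustrative assumptions, not values from the card:

```python
# Sketch: running Llama-PLLuM-70B-instruct with Hugging Face transformers.
# The 70B weights require multiple high-memory GPUs; device/dtype choices
# below are assumptions for illustration.

MODEL_ID = "CYFRAGOVPL/Llama-PLLuM-70B-instruct"


def build_chat(prompt: str) -> list[dict]:
    """Wrap a single user instruction in the chat-message format
    expected by tokenizer.apply_chat_template()."""
    return [{"role": "user", "content": prompt}]


def generate(prompt: str, max_new_tokens: int = 256) -> str:
    """Load the model and answer one instruction in Polish.
    Imports are local so the sketch can be read without transformers."""
    import torch
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
    model = AutoModelForCausalLM.from_pretrained(
        MODEL_ID, device_map="auto", torch_dtype=torch.bfloat16
    )
    inputs = tokenizer.apply_chat_template(
        build_chat(prompt), add_generation_prompt=True, return_tensors="pt"
    ).to(model.device)
    output = model.generate(inputs, max_new_tokens=max_new_tokens)
    # Decode only the newly generated tokens, not the echoed prompt.
    return tokenizer.decode(output[0][inputs.shape[-1]:], skip_special_tokens=True)
```

For example, `generate("Czym jest numer PESEL?")` would return a Polish-language answer about the PESEL identifier.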
Key Capabilities
- Extensive Polish Data Training: Pretrained on up to 150 billion tokens of Polish text, alongside Slavic, Baltic, and English data.
- Organic Instruction Tuning: Fine-tuned with approximately 40,000 manually created Polish instruction-response pairs, including multi-turn dialogues, to mitigate negative linguistic transfer.
- Polish Preference Corpus: Utilizes the first Polish-language preference corpus for alignment, enhancing correctness, balance, and safety, especially for sensitive topics.
- State-of-the-Art Polish Performance: Achieves top scores on custom benchmarks relevant to Polish public administration and state-of-the-art results in broader Polish-language tasks.
- RAG-based Adaptations: Designed to perform well in Retrieval Augmented Generation (RAG) settings, with specialized RAG-based models developed for domains like public administration.
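In a RAG setting like the one described above, retrieved passages are placed in the prompt alongside the user's question. A minimal sketch of assembling such a prompt in Polish; the template wording, section labels, and helper name are illustrative assumptions, not the official PLLuM RAG format:

```python
# Sketch: assembling a Polish RAG prompt from retrieved passages.
# The instruction text and "[n]" citation markers below are a
# hypothetical template, not the official PLLuM RAG format.

def build_rag_prompt(question: str, passages: list[str]) -> str:
    """Number the retrieved passages and append the question,
    asking the model to answer only from the given context."""
    context = "\n".join(
        f"[{i}] {passage}" for i, passage in enumerate(passages, start=1)
    )
    return (
        "Odpowiedz na pytanie wyłącznie na podstawie poniższego kontekstu.\n\n"
        f"Kontekst:\n{context}\n\n"
        f"Pytanie: {question}\n"
        "Odpowiedź:"
    )
```

The resulting string would be sent as the user message content; numbering the passages lets the model cite `[1]`, `[2]`, etc. in its answer.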
Good For
- General Polish Language Tasks: Text generation, summarization, and question answering in Polish.
- Domain-Specific Assistants: Particularly effective for applications related to Polish public administration, legal, and bureaucratic topics.
- Research & Development: Serving as a foundational model for AI applications requiring strong Polish language capabilities.