CYFRAGOVPL/Llama-PLLuM-70B-instruct

Public · 70B parameters · FP8 · 32,768-token context · License: llama3.1 · Available on Hugging Face
Overview

PLLuM: Polish Large Language Models

CYFRAGOVPL/Llama-PLLuM-70B-instruct is a 70-billion-parameter instruction-tuned model from the PLLuM family, developed by a consortium of Polish scientific institutions led by Politechnika Wrocławska. Built on the Llama 3.1 architecture, it is optimized primarily for Polish and other Slavic and Baltic languages, with English data included for broader generalization, and supports a 32,768-token context length.
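
The sketch below shows one plausible way to run the model with Hugging Face transformers. It is an illustrative assumption, not official PLLuM sample code: the chat-template call, dtype, and generation parameters are placeholders, and a 70B model still requires multi-GPU or otherwise high-memory hardware even in quantized form.

```python
# Minimal generation sketch (illustrative; not official PLLuM sample code).
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "CYFRAGOVPL/Llama-PLLuM-70B-instruct"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,  # assumed dtype; adjust to your hardware or quantized variant
    device_map="auto",           # shard the 70B weights across available GPUs
)

# A Polish instruction passed through the model's chat template (Llama 3.1 style).
messages = [
    {"role": "user", "content": "Streść w trzech zdaniach zasady składania wniosku o dowód osobisty."},
    # "Summarize in three sentences the rules for applying for an ID card."
]
input_ids = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

output = model.generate(input_ids, max_new_tokens=256, do_sample=True, temperature=0.3)
print(tokenizer.decode(output[0][input_ids.shape[-1]:], skip_special_tokens=True))
```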

Key Capabilities

  • Extensive Polish Data Training: Pretrained on up to 150 billion tokens of Polish text, alongside Slavic, Baltic, and English data.
  • Organic Instruction Tuning: Fine-tuned with approximately 40,000 manually created Polish instruction-response pairs, including multi-turn dialogues, to mitigate negative linguistic transfer.
  • Polish Preference Corpus: Utilizes the first Polish-language preference corpus for alignment, enhancing correctness, balance, and safety, especially for sensitive topics.
  • State-of-the-Art Polish Performance: Achieves top scores on custom benchmarks relevant to Polish public administration and state-of-the-art results in broader Polish-language tasks.
  • RAG-based Adaptations: Designed to perform well in Retrieval-Augmented Generation (RAG) settings, with specialized RAG-based models developed for domains such as public administration (a minimal prompt-construction sketch follows this list).
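
To make the RAG point above concrete, the sketch below builds a message list in which retrieved passages are injected into the system message ahead of the user question. The retrieval step, the helper function build_rag_messages, and the prompt wording are hypothetical illustrations, not the consortium's actual RAG pipeline.

```python
# Hypothetical prompt construction for a RAG setting (illustrative only; not the PLLuM RAG pipeline).
def build_rag_messages(question: str, passages: list[str]) -> list[dict]:
    """Place retrieved passages in the system message and ask the model to answer from them."""
    context = "\n\n".join(f"[{i + 1}] {p}" for i, p in enumerate(passages))
    system = (
        "Odpowiadaj wyłącznie na podstawie poniższych fragmentów. "      # "Answer only from the passages below."
        "Jeśli odpowiedzi nie ma we fragmentach, powiedz to wprost.\n\n"  # "If the answer is not there, say so."
        f"{context}"
    )
    return [
        {"role": "system", "content": system},
        {"role": "user", "content": question},
    ]

# Example: passages would normally come from a retriever over, e.g., public-administration documents.
messages = build_rag_messages(
    "Ile kosztuje wydanie dowodu osobistego?",          # "How much does issuing an ID card cost?"
    ["Wydanie dowodu osobistego jest bezpłatne."],      # "Issuing an ID card is free of charge."
)
```

The resulting message list can then be passed through the tokenizer's chat template exactly as in the loading sketch above.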

Good For

  • General Polish Language Tasks: Text generation, summarization, and question answering in Polish.
  • Domain-Specific Assistants: Particularly effective for applications related to Polish public administration, legal, and bureaucratic topics.
  • Research & Development: A foundational model for AI applications that require strong Polish-language capabilities.