Name: CYFRAGOVPL/Llama-PLLuM-70B-base-2412 API
Brand: Featherless.ai
Price: 25.00 USD
Availability: InStock
Author: CYFRAGOVPL

PLLuM: Polish Large Language Models

CYFRAGOVPL/Llama-PLLuM-70B-base-2412 is a 70-billion-parameter base model in the PLLuM family, developed by a consortium of Polish scientific institutions. It is built on the Llama 3.1 architecture and specializes in Polish and other Slavic and Baltic languages, with additional English data for broader generalization. The model underwent continued pretraining on extensive, high-quality Polish text corpora (up to 150 billion tokens).

Key Capabilities

Multilingual Specialization: Strong performance in Polish and other Slavic/Baltic languages.
High-Quality Training Data: Utilizes large-scale, cleaned, and deduplicated Polish text data.
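Advanced Alignment: Within the PLLuM family, instruction-tuned variants are refined with an organic instruction dataset of ~40k Polish prompt-response pairs and the first Polish-language preference corpus, improving correctness, balance, and safety. As a base model, this checkpoint is the pretrained starting point for that refinement.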
Domain-Specific Excellence: Achieves top scores on custom benchmarks relevant to Polish public administration tasks.

What Makes This Model Different?

Unlike many general-purpose LLMs, PLLuM models are specifically engineered for the nuances of the Polish language and its related linguistic contexts. The development includes unique, manually curated Polish instruction and preference datasets, which mitigate negative linguistic transfer and ensure high-quality, contextually appropriate responses. Its strong performance in Polish public administration tasks highlights its specialized utility.

Should I use this for my use case?

Yes, if: Your application requires strong performance in Polish language generation, summarization, or question answering. It is particularly well-suited for tasks related to Polish public administration, legal, or bureaucratic domains, especially when combined with RAG. Researchers and developers building downstream AI applications where a robust command of Polish is essential will find this model valuable.
Consider alternatives if: Your primary use case is exclusively in English or in other languages for which more specialized models exist, or you need a fully open-source license for commercial use. The PLLuM variants offered under such a license were trained on the smaller 28B-token dataset, whereas this 70B model uses the Llama 3.1 license and was trained on the larger 150B-token dataset.
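
Because this is a base model, it continues text rather than following chat instructions, so a RAG setup usually formats the retrieved passages and the question as a document for the model to complete. The sketch below is illustrative only: it assumes an OpenAI-compatible completions endpoint, and the URL, the response shape, and the helper names (build_rag_prompt, build_payload, complete) are assumptions introduced here, not taken from the model card. Check the provider's documentation before relying on any of them.

```python
import json
import urllib.request

# Assumption: an OpenAI-compatible completions endpoint. The URL is illustrative.
API_URL = "https://api.featherless.ai/v1/completions"
MODEL_ID = "CYFRAGOVPL/Llama-PLLuM-70B-base-2412"


def build_rag_prompt(question, passages):
    """Lay out retrieved passages and a question as text for the model to continue."""
    context = "\n\n".join(f"[{i}] {p}" for i, p in enumerate(passages, start=1))
    # Polish labels ("Kontekst", "Pytanie", "Odpowiedź") match the target language.
    return f"Kontekst:\n{context}\n\nPytanie: {question}\nOdpowiedź:"


def build_payload(prompt, max_tokens=256, temperature=0.2):
    """Build the JSON body for a completions request (hypothetical parameter choices)."""
    return {
        "model": MODEL_ID,
        "prompt": prompt,
        "max_tokens": max_tokens,
        "temperature": temperature,
        # Stop before the base model starts inventing a follow-up question.
        "stop": ["\nPytanie:"],
    }


def complete(prompt, api_key):
    """Send the request; assumes the response follows the OpenAI completions shape."""
    request = urllib.request.Request(
        API_URL,
        data=json.dumps(build_payload(prompt)).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )
    with urllib.request.urlopen(request, timeout=60) as response:
        body = json.load(response)
    return body["choices"][0]["text"]
```

Ending the prompt with "Odpowiedź:" and stopping at the next "Pytanie:" keeps the completion to a single answer, which matters more for a base model than for an instruction-tuned one.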
