CYFRAGOVPL/Llama-PLLuM-70B-instruct
CYFRAGOVPL/Llama-PLLuM-70B-instruct is a 70-billion-parameter instruction-tuned large language model from the PLLuM family, developed by a consortium of Polish institutions including Politechnika Wrocławska. Built on the Llama 3.1 architecture, it specializes in Polish and other Slavic and Baltic languages, with additional English data for generalization. It is well suited to tasks such as question answering and summarization, particularly for Polish public administration and general Polish-language applications.
PLLuM: Polish Large Language Models
CYFRAGOVPL/Llama-PLLuM-70B-instruct is a 70-billion-parameter instruction-tuned model from the PLLuM family, developed by a consortium of Polish scientific institutions led by Politechnika Wrocławska. The model is built on the Llama 3.1 architecture and optimized specifically for Polish and other Slavic and Baltic languages, with English data included for broader generalization. It supports a 32,768-token context length.
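As a Llama 3.1-based instruct checkpoint, the model can be loaded with Hugging Face `transformers` using the standard chat workflow. A minimal sketch, assuming the usual `apply_chat_template` flow; the hardware settings (`device_map="auto"`, bfloat16) and generation length are illustrative assumptions, not values from the card:

```python
# Sketch: running Llama-PLLuM-70B-instruct with Hugging Face transformers.
# The 70B weights require multiple high-memory GPUs; device/dtype choices
# below are assumptions for illustration.

MODEL_ID = "CYFRAGOVPL/Llama-PLLuM-70B-instruct"


def build_chat(prompt: str) -> list[dict]:
    """Wrap a single user instruction in the chat-message format
    expected by tokenizer.apply_chat_template()."""
    return [{"role": "user", "content": prompt}]


def generate(prompt: str, max_new_tokens: int = 256) -> str:
    """Load the model and answer one instruction in Polish.
    Imports are local so the sketch can be read without transformers."""
    import torch
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
    model = AutoModelForCausalLM.from_pretrained(
        MODEL_ID, device_map="auto", torch_dtype=torch.bfloat16
    )
    inputs = tokenizer.apply_chat_template(
        build_chat(prompt), add_generation_prompt=True, return_tensors="pt"
    ).to(model.device)
    output = model.generate(inputs, max_new_tokens=max_new_tokens)
    # Decode only the newly generated tokens, not the echoed prompt.
    return tokenizer.decode(output[0][inputs.shape[-1]:], skip_special_tokens=True)
```

For example, `generate("Czym jest numer PESEL?")` would return a Polish-language answer about the PESEL identifier.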
Key Capabilities
- Extensive Polish Data Training: Pretrained on up to 150 billion tokens of Polish text, alongside Slavic, Baltic, and English data.
- Organic Instruction Tuning: Fine-tuned with approximately 40,000 manually created Polish instruction-response pairs, including multi-turn dialogues, to mitigate negative linguistic transfer.
- Polish Preference Corpus: Utilizes the first Polish-language preference corpus for alignment, enhancing correctness, balance, and safety, especially for sensitive topics.
- State-of-the-Art Polish Performance: Achieves top scores on custom benchmarks relevant to Polish public administration and state-of-the-art results in broader Polish-language tasks.
- RAG-based Adaptations: Designed to perform well in Retrieval Augmented Generation (RAG) settings, with specialized RAG-based models developed for domains like public administration.
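In a RAG setting like the one described above, retrieved passages are placed in the prompt alongside the user's question. A minimal sketch of assembling such a prompt in Polish; the template wording, section labels, and helper name are illustrative assumptions, not the official PLLuM RAG format:

```python
# Sketch: assembling a Polish RAG prompt from retrieved passages.
# The instruction text and "[n]" citation markers below are a
# hypothetical template, not the official PLLuM RAG format.

def build_rag_prompt(question: str, passages: list[str]) -> str:
    """Number the retrieved passages and append the question,
    asking the model to answer only from the given context."""
    context = "\n".join(
        f"[{i}] {passage}" for i, passage in enumerate(passages, start=1)
    )
    return (
        "Odpowiedz na pytanie wyłącznie na podstawie poniższego kontekstu.\n\n"
        f"Kontekst:\n{context}\n\n"
        f"Pytanie: {question}\n"
        "Odpowiedź:"
    )
```

The resulting string would be sent as the user message content; numbering the passages lets the model cite `[1]`, `[2]`, etc. in its answer.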
Good For
- General Polish Language Tasks: Text generation, summarization, and question answering in Polish.
- Domain-Specific Assistants: Particularly effective for applications related to Polish public administration, legal, and bureaucratic topics.
- Research & Development: Serving as a foundational model for AI applications requiring strong Polish language capabilities.