Name: CYFRAGOVPL/Llama-PLLuM-8B-instruct-2512 API
Brand: Featherless.ai
Price: 10.00 USD
Availability: InStock
Author: CYFRAGOVPL

What is Llama-PLLuM-8B-instruct-2512?

Llama-PLLuM-8B-instruct-2512 is an 8 billion parameter instruction-tuned large language model, part of the PLLuM family developed by CYFRAGOVPL. It is built upon the Llama-3.1-8B base model and is specifically designed for the Polish language, while also incorporating English data for broader generalization. The model's development involved extensive data collection, rigorous cleaning, and deduplication of Polish and English text corpora.

Key Capabilities

Polish Language Specialization: Developed with a focus on high-quality Polish text data, including a large collection of manually created "organic instructions" and the first Polish-language preference corpus.
Instruction Tuning: Fine-tuned using approximately 70k manually curated Polish instructions, 33k programmatically derived instructions, 15k RAG-style context-processing instructions, and 45k synthetic, context-aware instructions.
Alignment and Safety: Utilizes ~60k manually annotated preference pairs to ensure safer, balanced, and contextually appropriate responses, even for sensitive topics.
Strong Performance: Achieves top scores on custom benchmarks relevant to Polish public administration and state-of-the-art results in broader Polish-language tasks.

Good For

General Language Tasks: Ideal for text generation, summarization, extraction, and question answering in Polish.
Domain-Specific Assistants: Particularly effective for applications within Polish public administration, legal, or bureaucratic contexts requiring domain-aware retrieval.
Research & Development: Serves as a robust foundation for AI applications demanding strong command of the Polish language.

Overview

What is Llama-PLLuM-8B-instruct-2512?

Key Capabilities

Good For

Full Model Card (README)