CYFRAGOVPL/PLLuM-12B-nc-chat

Overview

PLLuM-12B-nc-chat: Polish-Centric LLM for Non-Commercial Use

PLLuM-12B-nc-chat is a 12-billion-parameter large language model from the PLLuM family, developed by a consortium of Polish institutions led by Politechnika Wrocławska. Built on the Mistral-Nemo-Base-2407 architecture, the model targets Polish and other Slavic and Baltic languages, with additional English data included for broader generalization. It supports a 32,768-token context window.

Key Capabilities

  • Specialized in Polish: Pretrained on approximately 150 billion tokens of Polish text, along with Slavic, Baltic, and English data.
  • Advanced Alignment: Fine-tuned using a unique, manually curated dataset of ~40k Polish "organic instructions" and the first Polish-language preference corpus, enhancing correctness, balance, and safety.
  • High Performance: Achieves state-of-the-art results in Polish-language tasks and top scores on custom benchmarks relevant to Polish public administration.
  • Chat Optimized: The -chat suffix indicates the model is aligned to human preferences, making it safer and more effective for dialogue and general-purpose use.
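
The model can be loaded with the standard Hugging Face transformers API. The snippet below is a minimal inference sketch, not an official recipe: it assumes the repository ships a chat template for the tokenizer (typical for -chat models) and that enough GPU memory is available for a 12B model in bfloat16; the example prompt is illustrative.

```python
# Minimal inference sketch using the transformers library.
# Assumptions: the tokenizer defines a chat template, and a GPU with
# sufficient memory for a 12B bfloat16 model is available.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "CYFRAGOVPL/PLLuM-12B-nc-chat"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id, torch_dtype=torch.bfloat16, device_map="auto"
)

# Example chat prompt (Polish): "What documents do I need to get an ID card?"
messages = [
    {
        "role": "user",
        "content": "Jakie dokumenty są potrzebne do wyrobienia dowodu osobistego?",
    }
]
inputs = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

outputs = model.generate(inputs, max_new_tokens=256, do_sample=True, temperature=0.7)
# Decode only the newly generated tokens, skipping the prompt.
print(tokenizer.decode(outputs[0][inputs.shape[-1]:], skip_special_tokens=True))
```

For retrieval-augmented setups, the retrieved passages would simply be prepended to the user message (or placed in a system message) before applying the chat template.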

Good For

  • General Polish Language Tasks: Text generation, summarization, and question answering in Polish.
  • Domain-Specific Assistants: Particularly effective for applications related to Polish public administration, legal, or bureaucratic topics, especially when paired with retrieval-augmented generation (RAG).
  • Research & Development: A strong foundation for academic or industrial AI applications requiring robust Polish language capabilities, under a CC-BY-NC-4.0 license.