CYFRAGOVPL/PLLuM-12B-nc-chat-2412

TEXT GENERATIONConcurrency Cost:1Model Size:12BQuant:FP8Ctx Length:32kTool Calling:SupportedPublished:Feb 7, 2025License:cc-by-nc-4.0Architecture:Transformer0.0K Open Weights Gated Cold

The CYFRAGOVPL/PLLuM-12B-nc-chat-2412 is a 12 billion parameter large language model from the PLLuM family, developed by a consortium of Polish institutions led by Politechnika Wrocławska. Based on Mistral-Nemo-Base-2407, it is specialized for Polish and other Slavic/Baltic languages, with additional English data, and features a 32768 token context length. This chat-optimized model is aligned on human preferences for safer, more efficient dialogue in general-purpose scenarios, excelling in Polish language tasks and public administration applications.

Loading preview...

PLLuM-12B-nc-chat-2412: Polish-Centric LLM for Dialogue

The PLLuM-12B-nc-chat-2412 is a 12 billion parameter model from the PLLuM family, developed by a consortium of Polish scientific institutions led by Politechnika Wrocławska. It is built upon the Mistral-Nemo-Base-2407 architecture and is specifically designed for Polish and other Slavic/Baltic languages, incorporating English data for broader generalization. This model is instruction-tuned and aligned on human preferences, making it suitable for dialogue and general-purpose conversational AI.

Key Capabilities

  • Specialized Multilingualism: Optimized for Polish, Slavic, and Baltic languages, with strong performance in Polish-specific tasks.
  • Extensive Training Data: Pretrained on approximately 150 billion tokens of Polish text, along with additional Slavic, Baltic, and English data.
  • Organic Instruction Tuning: Refined using a unique, manually curated dataset of ~40k Polish "organic instructions" and ~3.5k multi-turn dialogues, designed to mitigate negative linguistic transfer.
  • Preference Learning: Features the first Polish-language preference corpus, teaching the model correctness, balance, and safety, particularly for controversial topics.
  • Strong Performance: Achieves top scores on custom benchmarks relevant to Polish public administration and state-of-the-art results in broader Polish-language tasks.
  • Dialogue Optimization: The "-chat" suffix indicates alignment on human preferences for safer and more efficient use in conversational scenarios.

Good for

  • General Language Tasks: Text generation, summarization, and question answering in Polish.
  • Domain-Specific Assistants: Particularly effective for applications in Polish public administration, legal, and bureaucratic contexts.
  • Research & Development: Serving as a foundational model for AI applications requiring strong Polish language capabilities.

This model is released under the CC-BY-NC-4.0 license, indicating non-commercial use.