radlab/pLLama3.2-3B-DPO

Hosted on Hugging Face · Text generation

  • Model size: 3.2B
  • Quantization: BF16
  • Context length: 32k
  • Published: Oct 17, 2024
  • License: llama3.2
  • Architecture: Transformer

radlab/pLLama3.2-3B-DPO is a 3.2 billion parameter language model developed by radlab, fine-tuned and optimized for the Polish language. This model underwent a two-stage training process, including fine-tuning on 650,000 Polish instructions and subsequent DPO (Direct Preference Optimization) on 100,000 examples focused on correct Polish writing. It excels at generating precise and grammatically sound Polish text, making it ideal for applications requiring high-quality Polish language generation and communication.


radlab/pLLama3.2-3B-DPO: Polish Language Optimized Model

radlab/pLLama3.2-3B-DPO is a 3.2 billion parameter model from the radlab/pLLama3.2 collection, specifically trained and optimized for the Polish language. It aims to provide more precise communication in Polish compared to the base Meta-Llama-3.2 models. This model is the DPO-processed version, building upon an initial fine-tuning stage.
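As a sketch of how the model might be used, the snippet below loads it with the Hugging Face `transformers` library and generates a Polish reply. The `build_messages` and `generate_polish` helpers, the system prompt, and the sampling parameters are illustrative assumptions, not part of the official model card; the heavy imports are deferred inside the function so the prompt-assembly logic can be read (and run) without `torch` or `transformers` installed.

```python
MODEL_ID = "radlab/pLLama3.2-3B-DPO"

def build_messages(user_prompt: str,
                   system_prompt: str = "Jesteś pomocnym asystentem piszącym poprawną polszczyzną.") -> list[dict]:
    """Assemble a chat history in the format expected by apply_chat_template."""
    return [
        {"role": "system", "content": system_prompt},
        {"role": "user", "content": user_prompt},
    ]

def generate_polish(user_prompt: str, max_new_tokens: int = 256) -> str:
    """Download the model on first call and generate a Polish reply.

    Imports are local so this module can be imported without torch installed.
    """
    import torch
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
    model = AutoModelForCausalLM.from_pretrained(
        MODEL_ID, torch_dtype=torch.bfloat16, device_map="auto"
    )
    input_ids = tokenizer.apply_chat_template(
        build_messages(user_prompt), add_generation_prompt=True, return_tensors="pt"
    ).to(model.device)
    output = model.generate(
        input_ids, max_new_tokens=max_new_tokens, do_sample=True, temperature=0.7
    )
    # Decode only the newly generated tokens, not the prompt.
    return tokenizer.decode(output[0][input_ids.shape[-1]:], skip_special_tokens=True)
```

A call such as `generate_polish("Napisz krótkie podsumowanie zalet energii słonecznej.")` would then return a Polish summary of the advantages of solar energy.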

Key Capabilities

  • Enhanced Polish Communication: Designed to interact more precisely and accurately in Polish.
  • Grammar and Style Correction: Refined with a DPO process that prefers correctly written Polish texts over versions containing linguistic errors.
  • Specialized Polish Datasets: Trained on a unique, semi-automatically generated dataset of approximately 650,000 Polish instructions, in addition to publicly available datasets.
  • Two-Stage Training: Underwent initial fine-tuning for 5 epochs on instruction data, followed by 15,000 DPO steps on a 100,000-example dataset targeting correct Polish writing.
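To make the DPO stage concrete, the sketch below shows the per-pair DPO objective and the shape of a preference pair in a "correct Polish writing" dataset. The log-probability values, the `beta` default, and the example sentences are illustrative assumptions; in practice a library such as TRL's `DPOTrainer` computes this loss over batches of pairs.

```python
import math

def dpo_loss(policy_chosen_logp: float, policy_rejected_logp: float,
             ref_chosen_logp: float, ref_rejected_logp: float,
             beta: float = 0.1) -> float:
    """DPO loss for one (chosen, rejected) pair:
    -log sigmoid(beta * [(logp diff for chosen) - (logp diff for rejected)]),
    where each diff is taken against a frozen reference model."""
    chosen_reward = beta * (policy_chosen_logp - ref_chosen_logp)
    rejected_reward = beta * (policy_rejected_logp - ref_rejected_logp)
    margin = chosen_reward - rejected_reward
    return -math.log(1.0 / (1.0 + math.exp(-margin)))

# Shape of one preference pair in a correct-writing DPO dataset (illustrative):
pair = {
    "prompt": "Popraw zdanie:",
    "chosen": "Poszedłem wczoraj do sklepu.",   # grammatically correct form
    "rejected": "Poszłem wczoraj do sklepu.",   # common colloquial error
}
```

The loss shrinks as the policy assigns relatively more probability to the `chosen` text than to the `rejected` one (compared to the reference model), which is how the DPO stage steers the model toward correct Polish.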

Good for

  • Applications requiring high-quality Polish text generation.
  • Chatbots and conversational AI systems interacting in Polish.
  • Content creation and summarization tasks in Polish.
  • Use cases where grammatical accuracy and natural language flow in Polish are critical.