OpenLLM-Ro/RoGemma2-9b-Instruct-DPO
OpenLLM-Ro/RoGemma2-9b-Instruct-DPO is a 9 billion parameter instruction-tuned generative text model developed by OpenLLM-Ro, designed specifically for the Romanian language. It is a human-aligned instruct variant, fine-tuned with Direct Preference Optimization (DPO) on several Romanian datasets. The model performs strongly on Romanian tasks such as few-shot machine translation (EN-RO) and semantic textual similarity (STS), making it suitable for research and assistant-like chat applications in Romanian.
Overview
OpenLLM-Ro/RoGemma2-9b-Instruct-DPO is a 9 billion parameter generative text model developed by OpenLLM-Ro, part of the first open-source initiative to build LLMs specialized for Romanian. This particular model is an instruction-tuned variant, fine-tuned using Direct Preference Optimization (DPO) on a collection of Romanian datasets including RoHelpSteer, RoUltraFeedback, and RoMagpieDPO.
Key Capabilities
- Romanian Language Specialization: Built specifically for Romanian, as part of a model family that offers both foundational and instruct variants.
- Human Alignment: Fine-tuned with DPO for improved human preferences and assistant-like chat.
- Strong Performance in Romanian Tasks: Achieves competitive results in few-shot machine translation (EN-RO BLEU: 28.16) and semantic textual similarity (STS Spearman: 73.24) compared to other models in its family.
- Academic Benchmarks: Demonstrates an average score of 59.79 on academic benchmarks, with notable performance in Winogrande (73.16) and Hellaswag (64.26).
Intended Use Cases
- Research: Ideal for research purposes in Romanian natural language processing.
- Assistant-like Chat: Instruction and chat-tuned models are designed for conversational AI applications.
- Natural Language Tasks: Base models can be adapted for a variety of Romanian NLP tasks.
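For the assistant-like chat use case above, a minimal loading sketch with the Hugging Face `transformers` library is shown below. This assumes `transformers` (with a Gemma-2-compatible version) and `torch` are installed and that sufficient memory is available; the model ID is taken from this card, while the example prompt and generation settings are illustrative choices, not recommendations from OpenLLM-Ro.

```python
# Illustrative sketch: chatting with RoGemma2-9b-Instruct-DPO via transformers.
# Assumes the `transformers` and `torch` packages are installed and the model
# weights can be downloaded from the Hugging Face Hub.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "OpenLLM-Ro/RoGemma2-9b-Instruct-DPO"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, device_map="auto")

# Gemma-2-based instruct models ship a chat template with the tokenizer,
# so the conversation is passed as a list of role/content messages.
messages = [
    {"role": "user", "content": "Ce este un model de limbaj?"},  # Romanian: "What is a language model?"
]
inputs = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

outputs = model.generate(inputs, max_new_tokens=256)
# Decode only the newly generated tokens, skipping the prompt.
print(tokenizer.decode(outputs[0][inputs.shape[-1]:], skip_special_tokens=True))
```

The chat template is what maps the `messages` list into the Gemma-2 turn format, so no manual prompt formatting is needed.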