OpenLLM-Ro/RoMistral-7b-Instruct-DPO: A Specialized Romanian LLM
OpenLLM-Ro/RoMistral-7b-Instruct-DPO is a 7 billion parameter instruction-tuned generative text model developed by OpenLLM-Ro, the first open-source effort to build a large language model specialized for Romanian. It is a human-aligned instruct variant, fine-tuned from RoMistral-7b-Instruct-2025-04-23 via Direct Preference Optimization (DPO) on several Romanian preference datasets, including RoHelpSteer and RoUltraFeedback.
Key Capabilities & Performance
- Romanian Language Specialization: Developed specifically for Romanian, addressing the need for open-source LLMs in this language.
- Instruction Following: Designed for assistant-like chat and instruction-based tasks in Romanian.
- Strong Benchmark Performance: Achieves an average score of 56.62 across academic benchmarks, outperforming other RoMistral variants and Mistral-7B-Instruct-v0.2 on several tasks, including ARC (55.51), MMLU (52.61), HellaSwag (64.97), and GSM8K (41.07).
- High MT-Bench Score: Records an MT-Bench average of 6.61, with all 160 of its answers given in Romanian, indicating robust conversational ability in the target language.
- RoCulturaBench Excellence: Scores 4.93 on RoCulturaBench, demonstrating strong understanding of Romanian cultural context.
Intended Use Cases
- Research in Romanian NLP: Ideal for academic and research purposes focused on the Romanian language.
- Assistant-like Chatbots: Suitable for developing conversational AI applications that interact in Romanian.
- Natural Language Tasks: Can be adapted for various Romanian natural language processing tasks.
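For the chatbot and NLP use cases above, the model can be loaded with the Hugging Face `transformers` library like any other Mistral-family instruct model. The sketch below is a minimal, illustrative example; the generation settings and the example prompt are assumptions, not values prescribed by the model card.

```python
# Hedged sketch: chatting with OpenLLM-Ro/RoMistral-7b-Instruct-DPO via
# Hugging Face transformers. Requires a GPU with enough memory for a 7B model
# (or appropriate quantization); settings here are illustrative assumptions.
from transformers import AutoModelForCausalLM, AutoTokenizer

MODEL_ID = "OpenLLM-Ro/RoMistral-7b-Instruct-DPO"


def build_messages(user_prompt: str) -> list[dict]:
    """Wrap a Romanian user prompt in the chat-message format
    consumed by tokenizer.apply_chat_template()."""
    return [{"role": "user", "content": user_prompt}]


def chat(user_prompt: str, max_new_tokens: int = 256) -> str:
    """Generate a single assistant reply for one user turn."""
    tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
    model = AutoModelForCausalLM.from_pretrained(MODEL_ID, device_map="auto")

    # Format the conversation with the model's own chat template.
    input_ids = tokenizer.apply_chat_template(
        build_messages(user_prompt),
        add_generation_prompt=True,
        return_tensors="pt",
    ).to(model.device)

    output_ids = model.generate(input_ids, max_new_tokens=max_new_tokens)
    # Drop the prompt tokens so only the model's reply is decoded.
    return tokenizer.decode(
        output_ids[0][input_ids.shape[-1]:], skip_special_tokens=True
    )


if __name__ == "__main__":
    # Example Romanian prompt: "What is the capital of Romania?"
    print(chat("Care este capitala României?"))
```

The chat template applied by `apply_chat_template` comes from the model's tokenizer configuration, so the same code works unchanged across the dated RoMistral instruct releases.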