OpenLLM-Ro/RoLlama2-7b-Instruct-DPO
OpenLLM-Ro/RoLlama2-7b-Instruct-DPO is a 7 billion parameter instruction-tuned language model developed by OpenLLM-Ro, specifically optimized for the Romanian language. This model is part of the RoLlama2 family, fine-tuned using Direct Preference Optimization (DPO) on Romanian datasets like RoHelpSteer and RoUltraFeedback. It excels in Romanian natural language tasks, particularly as an assistant-like chatbot, demonstrating superior performance on Romanian MT-Bench and RoCulturaBench compared to its predecessors and Llama-2-7b-chat.
Loading preview...
RoLlama2-7b-Instruct-DPO: Romanian LLM
OpenLLM-Ro/RoLlama2-7b-Instruct-DPO is a 7 billion parameter instruction-tuned model developed by OpenLLM-Ro, representing a significant open-source effort to create specialized LLMs for Romanian. This model is the human-aligned instruct variant within the RoLlama2 family, building upon the RoLlama2-7b-Instruct-2025-04-23 model.
Key Capabilities & Features
- Romanian Language Specialization: Developed and fine-tuned exclusively for the Romanian language, addressing a critical gap in open-source LLMs.
- DPO Fine-tuning: Utilizes Direct Preference Optimization (DPO) on a suite of Romanian datasets including RoHelpSteer, RoUltraFeedback, and others, enhancing its alignment and instruction-following abilities.
- Improved Performance: Demonstrates superior performance on Romanian-specific benchmarks such as Romanian MT-Bench and RoCulturaBench, significantly outperforming Llama-2-7b-chat and earlier RoLlama2 instruct models in Romanian language understanding and generation.
- Assistant-like Chat: Designed for assistant-like conversational tasks, providing helpful, respectful, and honest responses in Romanian.
Intended Use Cases
- Research: Ideal for academic research in Romanian natural language processing.
- Assistant-like Chatbots: Suited for developing conversational AI applications that require high proficiency in Romanian.
- Natural Language Tasks: Adaptable for various Romanian NLP tasks, leveraging its specialized training.
This model is licensed under cc-by-nc-4.0 and is intended for research use, with usage in other languages or in violation of the license being out-of-scope.