OpenLLM-Ro/RoLlama3.1-8b-Instruct-DPO
OpenLLM-Ro/RoLlama3.1-8b-Instruct-DPO is an 8 billion parameter instruction-tuned generative text model developed by OpenLLM-Ro, built with Meta Llama 3.1. This model is specifically fine-tuned for the Romanian language, excelling in human-aligned instruction following and chat tasks. It represents a dedicated open-source effort to provide powerful LLMs for Romanian, demonstrating strong performance on Romanian-specific benchmarks like MT-Bench and RoCulturaBench.
Loading preview...
Overview
OpenLLM-Ro/RoLlama3.1-8b-Instruct-DPO is an 8 billion parameter generative text model, part of the RoLlama3.1 family developed by OpenLLM-Ro. It is built upon Meta Llama 3.1 and specifically fine-tuned for the Romanian language, representing the first open-source initiative to create specialized LLMs for Romanian. This particular variant is a human-aligned instruct model, optimized for conversational and assistant-like interactions.
Key Capabilities
- Romanian Language Specialization: Developed and fine-tuned exclusively for Romanian, addressing a critical gap in open-source LLMs.
- Instruction Following: Designed for assistant-like chat and instruction-following tasks in Romanian.
- DPO Fine-tuning: Utilizes Direct Preference Optimization (DPO) with datasets like RoHelpSteer and RoUltraFeedback to enhance alignment and performance.
- Strong Romanian Benchmarks: Achieves an average score of 7.00 on MT-Bench and 4.73 on RoCulturaBench, outperforming other RoLlama3.1 variants and the base Llama-3.1-8B-Instruct on these Romanian-specific metrics.
Intended Use Cases
- Research in Romanian NLP: Ideal for academic and research purposes focused on the Romanian language.
- Assistant-like Chatbots: Suitable for developing conversational AI applications that require high proficiency in Romanian.
- Natural Language Tasks: Base models can be adapted for various Romanian natural language processing tasks.