OpenLLM-Ro/RoLlama3-8b-Instruct-DPO
OpenLLM-Ro/RoLlama3-8b-Instruct-DPO is an 8 billion parameter instruction-tuned generative text model developed by OpenLLM-Ro, built with Meta Llama 3 and specialized for the Romanian language. This model is the human-aligned instruct variant, fine-tuned using multiple Romanian DPO datasets. It excels in Romanian-specific benchmarks, demonstrating strong performance in tasks like RoCulturaBench and Romanian MT-Bench, making it ideal for assistant-like chat and research in Romanian natural language processing.
Loading preview...
OpenLLM-Ro/RoLlama3-8b-Instruct-DPO: Romanian-Specialized Llama 3
OpenLLM-Ro/RoLlama3-8b-Instruct-DPO is an 8 billion parameter instruction-tuned generative text model developed by OpenLLM-Ro, specifically designed for the Romanian language. Built upon the Meta Llama 3 architecture, this model represents a significant open-source effort to provide high-quality LLMs tailored for Romanian NLP tasks. It is the human-aligned instruct variant within the RoLlama3 family, fine-tuned using a collection of Romanian DPO datasets including RoHelpSteer, RoUltraFeedback, and RoMagpieDPO.
Key Capabilities
- Romanian Language Specialization: Optimized for understanding and generating text exclusively in Romanian.
- Instruction Following: Fine-tuned for assistant-like chat and responding to instructions effectively.
- Human Alignment: Utilizes Direct Preference Optimization (DPO) on diverse Romanian datasets to enhance human alignment.
- Strong Benchmark Performance: Achieves an average score of 55.86 on academic benchmarks (ARC, MMLU, Winogrande, Hellaswag, GSM8k, TruthfulQA) and 6.67 on Romanian MT-Bench, outperforming other RoLlama3 variants and Llama-3-8B-Instruct in several key metrics, including a 57.06 on TruthfulQA and 4.83 on RoCulturaBench.
Good for
- Research in Romanian natural language processing.
- Developing AI assistants and chatbots for Romanian users.
- Applications requiring high-quality text generation and instruction following in Romanian.
- Exploring human-aligned LLM capabilities for a specific language.