OpenLLM-Ro/RoLlama3.1-8b-Instruct-2024-10-09

Text Generation · Concurrency Cost: 1 · Model Size: 8B · Quant: FP8 · Ctx Length: 32k · Published: Oct 1, 2024 · License: cc-by-nc-4.0 · Architecture: Transformer · Open Weights

OpenLLM-Ro/RoLlama3.1-8b-Instruct-2024-10-09 is an 8 billion parameter instruction-tuned generative text model developed by OpenLLM-Ro, built on Meta Llama 3.1 and optimized specifically for the Romanian language. It is suited to assistant-like chat applications and intended for research use in Romanian-language contexts, where it demonstrates strong performance on Romanian-specific benchmarks.


RoLlama3.1-8b-Instruct-2024-10-09: Romanian Language Model

OpenLLM-Ro/RoLlama3.1-8b-Instruct-2024-10-09 is an 8 billion parameter instruction-tuned model developed by OpenLLM-Ro, fine-tuned from Meta-Llama-3.1-8B-Instruct. It is part of a sustained open-source effort to build capable Large Language Models specifically for Romanian.

Key Capabilities and Features

  • Romanian Language Specialization: This model is explicitly designed and trained for Romanian, making it highly effective for tasks in this language.
  • Instruction-Tuned: Optimized for assistant-like chat applications, capable of following instructions and generating coherent responses.
  • Comprehensive Training Data: Fine-tuned using a diverse collection of Romanian instruction datasets, including RoAlpaca, RoDolly, RoSelfInstruct, and RoUltraChat.
  • Competitive Benchmarking: On academic benchmarks, RoLlama3.1-8b-Instruct-2024-10-09 shows strong performance, often surpassing or closely matching the base Llama-3.1-8B-Instruct on Romanian-specific tasks and general metrics like ARC, Winogrande, Hellaswag, and GSM8k. For instance, it achieves an average score of 53.03% compared to Llama-3.1-8B-Instruct's 49.87%.
  • Downstream Task Performance: Demonstrates robust performance on Romanian downstream tasks, such as an 87.53% Macro F1 on LaRoSeDa (multiclass sentiment, fine-tuned) and 21.88 BLEU on WMT EN→RO translation (few-shot).
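The Macro F1 figure quoted above averages per-class F1 scores with equal weight, so minority classes count as much as frequent ones. A minimal, dependency-free sketch of the metric (the toy labels are illustrative, not from the LaRoSeDa dataset):

```python
def macro_f1(y_true, y_pred):
    """Macro-averaged F1: compute F1 per label, then take the unweighted mean."""
    labels = set(y_true) | set(y_pred)
    f1_scores = []
    for label in labels:
        # Per-label confusion counts against all other labels.
        tp = sum(1 for t, p in zip(y_true, y_pred) if t == label and p == label)
        fp = sum(1 for t, p in zip(y_true, y_pred) if t != label and p == label)
        fn = sum(1 for t, p in zip(y_true, y_pred) if t == label and p != label)
        precision = tp / (tp + fp) if (tp + fp) else 0.0
        recall = tp / (tp + fn) if (tp + fn) else 0.0
        denom = precision + recall
        f1_scores.append(2 * precision * recall / denom if denom else 0.0)
    return sum(f1_scores) / len(f1_scores)

# Toy sentiment labels: per-class F1 is 2/3 for "pos" and 4/5 for "neg",
# so the macro average is (2/3 + 4/5) / 2 = 11/15 ≈ 0.7333.
score = macro_f1(["pos", "neg", "pos", "neg"], ["pos", "neg", "neg", "neg"])
```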

Intended Use Cases

This model is primarily intended for:

  • Research Use: Ideal for academic and research projects focused on Romanian natural language processing.
  • Assistant-like Chat: Suitable for developing conversational AI applications that require understanding and generating Romanian text.
  • Adaptation for NLP Tasks: The base models within the RoLlama3.1 family can be adapted for a variety of specific Romanian NLP tasks.
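For assistant-like chat, the model follows the standard Llama 3.1 chat format, which Hugging Face `transformers` applies via the tokenizer's chat template. A minimal usage sketch under that assumption (the Romanian system prompt and generation settings are illustrative choices, not prescribed by the model card):

```python
MODEL_ID = "OpenLLM-Ro/RoLlama3.1-8b-Instruct-2024-10-09"

def build_messages(user_prompt, system_prompt="Ești un asistent util."):
    # Chat-template input: a list of role/content dicts, consumed by
    # tokenizer.apply_chat_template. System prompt means "You are a helpful assistant."
    return [
        {"role": "system", "content": system_prompt},
        {"role": "user", "content": user_prompt},
    ]

def generate(user_prompt, max_new_tokens=256):
    # Heavy imports are kept local so build_messages stays importable
    # without torch/transformers installed.
    import torch
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
    model = AutoModelForCausalLM.from_pretrained(
        MODEL_ID, torch_dtype=torch.bfloat16, device_map="auto"
    )
    input_ids = tokenizer.apply_chat_template(
        build_messages(user_prompt),
        add_generation_prompt=True,
        return_tensors="pt",
    ).to(model.device)
    output = model.generate(input_ids, max_new_tokens=max_new_tokens)
    # Decode only the newly generated tokens, not the prompt.
    return tokenizer.decode(output[0][input_ids.shape[-1]:], skip_special_tokens=True)
```

The local imports in `generate` are a deliberate choice: prompt construction can be exercised cheaply, while the 8B weights are only downloaded when generation is actually invoked.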