MohamedRashad/Arabic-Orpo-Llama-3-8B-Instruct

Parameters: 8B · Precision: FP8 · Context length: 8192 tokens · License: llama3
Overview

What is Arabic-Orpo-Llama-3-8B-Instruct?

This model is a fine-tuned version of Meta's Llama-3-8B-Instruct, developed by MohamedRashad. It was trained with ORPO (Odds Ratio Preference Optimization) on the 2A2I/argilla-dpo-mix-7k-arabic dataset, with the primary goal of improving the model's performance and alignment for the Arabic language.

Key Characteristics

  • Base Model: Meta-Llama-3-8B-Instruct (8 billion parameters, 8192 tokens context length).
  • Fine-tuning Method: ORPO, a technique aimed at better aligning language models.
  • Target Language: Specifically optimized for generating Arabic text.
  • Practical Performance: Formal evaluation on the community `arabic_mmlu` benchmark shows a slight drop in overall score (0.317 vs. 0.348 for the base Llama-3-8B-Instruct), but the developer reports that in practice the fine-tuned model produces more coherent and largely correct Arabic text.
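The ORPO method mentioned above penalizes the model when a rejected response has higher odds than the chosen one, using a log odds-ratio term added to the standard fine-tuning loss. A minimal sketch of that preference term (illustrative only; the function name and the use of length-normalized sequence probabilities are assumptions, not the exact training code):

```python
import math

def odds(p):
    # Odds of generating a sequence with (length-normalized) probability p.
    return p / (1.0 - p)

def orpo_penalty(p_chosen, p_rejected):
    # Odds-ratio term of the ORPO loss: -log sigmoid(log(odds_w / odds_l)).
    # Small when the chosen response is more likely than the rejected one.
    log_or = math.log(odds(p_chosen)) - math.log(odds(p_rejected))
    return -math.log(1.0 / (1.0 + math.exp(-log_or)))

# Preferring the chosen response yields a lower penalty than preferring the rejected one.
print(orpo_penalty(0.6, 0.2))  # chosen more likely -> small penalty
print(orpo_penalty(0.2, 0.6))  # rejected more likely -> large penalty
```

In the full ORPO objective this term is scaled by a weight and added to the usual supervised fine-tuning loss, so the model learns the chosen responses while being pushed away from the rejected ones.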

When to Use This Model

  • Arabic Text Generation: Ideal for applications requiring high-quality, coherent Arabic language output.
  • Experimentation with ORPO: Useful for researchers and developers interested in observing the practical effects of ORPO fine-tuning on language alignment, especially for non-English languages.
  • Arabic-focused LLM Applications: A strong candidate for chatbots, content generation, or other tasks where robust Arabic language understanding and generation are critical.
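Since the model inherits Llama 3's instruct chat template, prompts for the use cases above should follow that format. In practice `tokenizer.apply_chat_template` from `transformers` handles this automatically; the sketch below builds the template by hand purely to show its structure (the helper name and example messages are illustrative):

```python
def format_llama3_prompt(system: str, user: str) -> str:
    # Llama 3 instruct chat template, assembled manually for illustration.
    # Normally tokenizer.apply_chat_template produces this string.
    return (
        "<|begin_of_text|>"
        "<|start_header_id|>system<|end_header_id|>\n\n" + system + "<|eot_id|>"
        "<|start_header_id|>user<|end_header_id|>\n\n" + user + "<|eot_id|>"
        "<|start_header_id|>assistant<|end_header_id|>\n\n"
    )

prompt = format_llama3_prompt(
    "أنت مساعد مفيد يجيب باللغة العربية.",  # "You are a helpful assistant who answers in Arabic."
    "ما هي عاصمة مصر؟",                    # "What is the capital of Egypt?"
)
print(prompt)
```

The trailing assistant header leaves the model positioned to generate its reply; generation is typically stopped at the `<|eot_id|>` token.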