HiTZ/Latxa-Llama-3.1-8B-Instruct

Warm
Public
8B
FP8
32768
Feb 27, 2025
License: other
Hugging Face
Overview

Latxa 3.1 8B Instruct: Basque Language Adaptation

Latxa 3.1 8B Instruct is an instruction-tuned large language model developed by the HiTZ Research Center & IXA Research group, building upon Meta's Llama-3.1-8B-Instruct. This model addresses the performance gap for low-resource languages by undergoing extensive language adaptation using a 4.2 billion token Basque corpus (Etxaniz et al., 2024).

Key Capabilities

  • Basque Language Proficiency: Demonstrates substantial performance improvements over Llama-3.1-Instruct on standard Basque benchmarks, particularly in chat conversations.
  • Instruction Following: Designed to follow instructions and function effectively as a chat assistant in Basque.
  • Competitive Performance: Preliminary evaluations, including an arena-based assessment, show Latxa 3.1 8B Instruct ranking highly against other models, including proprietary ones like GPT-4o and Claude Sonnet, for Basque tasks.

Evaluation Highlights

Latxa 3.1 8B Instruct shows significant gains across various Basque datasets compared to Llama-3.1 8B Instruct:

  • Belebele: 80.00% accuracy (vs. 73.89% for Llama-3.1 8B Instruct)
  • X-Story Cloze: 71.34% accuracy (vs. 61.22%)
  • EusProficiency: 52.83% accuracy (vs. 34.13%)
  • EusReading: 62.78% accuracy (vs. 49.72%)
  • EusTrivia: 61.05% accuracy (vs. 45.01%)
  • EusExams: 56.00% accuracy (vs. 46.21%)

Good for

  • Applications requiring high-quality language generation and understanding in Basque.
  • Developing chatbots and conversational AI systems for Basque speakers.
  • Research and development in low-resource language NLP, specifically for Basque.