orai-nlp/Llama-eus-8B

Warm · Public · 8B · FP8 · 32768 · Sep 4, 2024 · Hugging Face
Overview

Llama-eus-8B: A Foundational LLM for Basque

Llama-eus-8B is an 8-billion-parameter foundational large language model developed by Orai NLP Technologies and built on Meta's Llama 3.1-8B. To address the limitations of general-purpose LLMs on low-resource languages, it underwent continual pretraining focused on Basque. The training data comprised approximately 1.5 billion high-quality Basque tokens from the ZelaiHandi dataset and 300 million English tokens from the FineWeb dataset.

Key Capabilities & Differentiators

  • Enhanced Basque Linguistic Competence: Llama-eus-8B shows significant improvements in both formal (grammar, vocabulary) and functional (real-world usage) Basque linguistic competence compared to its base model, Meta-Llama-3.1-8B.
  • Bilingual Performance: It maintains strong general English capabilities with minimal degradation, effectively avoiding catastrophic forgetting during Basque-focused pretraining.
  • Competitive Benchmarking: Among models with fewer than 10 billion parameters, Llama-eus-8B consistently outperforms Latxa-7b-v1.2 and Meta-Llama-3.1-8B across various Basque benchmarks, achieving an average score of 61.22. It is also competitive with the larger Latxa-13B and Latxa-70B models on Basque tasks.
  • Efficient Training: The model was trained using the Hugging Face Transformers ecosystem with Accelerate and DeepSpeed ZeRO on 8x NVIDIA A100 GPUs. Training ran for 4 epochs over the combined corpus of roughly 1.8 billion tokens, processing about 7.2 billion tokens in total.
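
The token figures quoted in this overview can be cross-checked with a little arithmetic. The sketch below uses only the approximate counts stated above; the function and variable names are illustrative and do not come from the model's training code.

```python
# Approximate token counts as stated in this overview.
BASQUE_TOKENS = 1_500_000_000   # ZelaiHandi (Basque)
ENGLISH_TOKENS = 300_000_000    # FineWeb (English)
EPOCHS = 4


def training_budget(basque: int, english: int, epochs: int) -> dict:
    """Summarize the data mix and total tokens seen during training."""
    per_epoch = basque + english
    return {
        "tokens_per_epoch": per_epoch,
        "total_tokens": per_epoch * epochs,
        "basque_share": basque / per_epoch,
    }


budget = training_budget(BASQUE_TOKENS, ENGLISH_TOKENS, EPOCHS)
print(f"Tokens per epoch: {budget['tokens_per_epoch']:,}")
print(f"Total tokens:     {budget['total_tokens']:,}")
print(f"Basque share:     {budget['basque_share']:.1%}")
```

Under these figures, four passes over the combined corpus give the 7.2 billion tokens quoted for training, with Basque making up about five sixths of each epoch.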

Ideal Use Cases

  • Basque Language Applications: Developing applications requiring high linguistic accuracy and understanding in Basque.
  • Cross-Lingual Research: Projects focusing on low-resource language adaptation and cross-lingual transfer learning.
  • Resource-Constrained Environments: Running a Basque-optimized model without the computational overhead of much larger LLMs.
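
For the use cases above, a minimal sketch of loading the model with Hugging Face Transformers might look like the following. Because Llama-eus-8B is a foundational (not instruction-tuned) model, the example uses a few-shot prompt. The prompt layout, the helper names, and the choice of bfloat16 are assumptions made for illustration and are not prescribed by the model's authors. `device_map="auto"` additionally requires the `accelerate` package.

```python
MODEL_ID = "orai-nlp/Llama-eus-8B"


def build_few_shot_prompt(examples, query):
    """Build an English-to-Basque few-shot prompt (illustrative format)."""
    blocks = [f"English: {en}\nBasque: {eu}" for en, eu in examples]
    blocks.append(f"English: {query}\nBasque:")
    return "\n\n".join(blocks)


def generate(prompt, max_new_tokens=32):
    """Download the model and greedily continue the prompt.

    Imports are local so the prompt helper can be used without
    torch/transformers installed.
    """
    import torch
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
    model = AutoModelForCausalLM.from_pretrained(
        MODEL_ID,
        torch_dtype=torch.bfloat16,  # assumption: hardware supports bf16
        device_map="auto",           # needs the `accelerate` package
    )
    inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
    output_ids = model.generate(
        **inputs, max_new_tokens=max_new_tokens, do_sample=False
    )
    # Keep only the newly generated tokens, not the echoed prompt.
    new_tokens = output_ids[0][inputs["input_ids"].shape[1]:]
    return tokenizer.decode(new_tokens, skip_special_tokens=True)


prompt = build_few_shot_prompt(
    [("Hello", "Kaixo"), ("Thank you", "Eskerrik asko")],
    "Good morning",
)
print(prompt)
```

Calling `generate(prompt)` downloads the weights on first use, so it is left out of the snippet itself. A base model will usually keep generating past the answer, so in practice the output is often truncated at the first newline.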