orai-nlp/Llama-eus-8B
Llama-eus-8B is an 8-billion-parameter foundational large language model developed by Orai NLP Technologies, adapted from Meta's Llama 3.1. It is tailored to the Basque language through continual pretraining on 1.5 billion high-quality Basque tokens from the ZelaiHandi dataset, alongside a subset of FineWeb. The model substantially improves both formal and functional linguistic competence in Basque while largely retaining its general English capabilities. With a 32,768-token context length, it is optimized for natural language understanding and instruction following in low-resource languages such as Basque.
Llama-eus-8B: A Foundational LLM for Basque
Llama-eus-8B is an 8-billion-parameter foundational large language model developed by Orai NLP Technologies, built upon Meta's Llama 3.1-8B. It addresses the limitations of general-purpose LLMs in low-resource languages through specialized continual pretraining focused on Basque: the model was trained on approximately 1.5 billion high-quality Basque tokens from the ZelaiHandi dataset and 300 million English tokens from the FineWeb dataset.
Key Capabilities & Differentiators
- Enhanced Basque Linguistic Competence: Llama-eus-8B shows significant improvements in both formal (grammar, vocabulary) and functional (real-world usage) Basque linguistic competence compared to its base model, Meta-Llama-3.1-8B.
- Bilingual Performance: It maintains strong general English capabilities with minimal degradation, effectively avoiding catastrophic forgetting during Basque-focused pretraining.
- Competitive Benchmarking: Among sub-10-billion-parameter models, Llama-eus-8B consistently outperforms Latxa-7b-v1.2 and Meta-Llama-3.1-8B across Basque benchmarks, achieving an average score of 61.22. It also performs competitively against larger models such as Latxa-13B and Latxa-70B on Basque tasks.
- Efficient Training: The model was trained using the Hugging Face Transformers ecosystem with Accelerate and DeepSpeed ZeRO on 8x NVIDIA A100 GPUs, processing 7.2 billion tokens over 4 epochs.
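The token figures above are internally consistent: 1.5 billion Basque tokens plus 300 million English tokens per epoch, over 4 epochs, gives the reported 7.2 billion tokens. A quick sanity check (the per-epoch mixture assumption is ours; the card does not state the schedule explicitly):

```python
# Training token budget sanity check.
# Figures come from the model card; the assumption that each epoch
# covers the full Basque + English mixture is illustrative.
basque_tokens = 1_500_000_000   # ZelaiHandi (Basque)
english_tokens = 300_000_000    # FineWeb subset (English)
epochs = 4

tokens_per_epoch = basque_tokens + english_tokens
total_tokens = tokens_per_epoch * epochs

print(tokens_per_epoch)  # 1800000000
print(total_tokens)      # 7200000000 -- matches the reported 7.2B tokens
```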
Ideal Use Cases
- Basque Language Applications: Developing applications requiring high linguistic accuracy and understanding in Basque.
- Cross-Lingual Research: Projects focusing on low-resource language adaptation and cross-lingual transfer learning.
- Resource-Constrained Environments: Utilizing a powerful Basque-optimized model without the computational overhead of much larger LLMs.
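For the use cases above, the model can be loaded with the standard Hugging Face Transformers API. A minimal inference sketch (the model id comes from this card; the generation settings and the Basque prompt are illustrative, and since this is a base model, plain text completion is used rather than a chat template):

```python
# Minimal inference sketch for Llama-eus-8B (a base/foundational model).
# Generation settings below are illustrative defaults, not recommendations
# from the model card.
from transformers import AutoModelForCausalLM, AutoTokenizer

MODEL_ID = "orai-nlp/Llama-eus-8B"

def generate(prompt: str, max_new_tokens: int = 128) -> str:
    tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
    model = AutoModelForCausalLM.from_pretrained(
        MODEL_ID,
        device_map="auto",   # requires accelerate; places weights on GPU if available
        torch_dtype="auto",  # use the checkpoint's native precision
    )
    inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
    outputs = model.generate(
        **inputs,
        max_new_tokens=max_new_tokens,
        do_sample=True,
        temperature=0.7,
    )
    return tokenizer.decode(outputs[0], skip_special_tokens=True)

if __name__ == "__main__":
    # Basque completion prompt (illustrative).
    print(generate("Euskara hizkuntza bat da, eta"))
```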