Latxa-Llama-3.1-70B-Instruct Overview
Latxa-Llama-3.1-70B-Instruct is a 70 billion parameter instruction-tuned large language model developed by the HiTZ Research Center & IXA Research group. It is built upon Meta's Llama-3.1-Instruct and has undergone further training using language adaptation techniques on a substantial Basque corpus (4.3 million documents, 4.2 billion tokens).
Key Capabilities & Differentiators
- Basque Language Specialization: Specifically designed and optimized for the Basque language, addressing the performance gap observed in low-resource languages with general-purpose LLMs.
- Superior Basque Performance: Preliminary evaluations show it significantly outperforms the base Llama-3.1-Instruct model on standard Basque benchmarks and in chat conversations.
- Instruction Following: Trained to follow instructions and function as a chat assistant, making it suitable for interactive applications.
- Competitive Arena Performance: Achieved 3rd place in a public arena-based evaluation against models like GPT-4o and Claude Sonnet, outperforming other same-size competitors.
Intended Use Cases
- Basque Language Applications: Ideal for any application requiring high-quality natural language processing or generation in Basque.
- Instruction Following & Chatbots: Suitable for building instruction-following agents or conversational AI systems in Basque.
Limitations
- Performance is not guaranteed for languages other than Basque.
- Inherits potential biases and limitations from its parent Llama 3.1 models.