HiTZ/latxa-7b-v1.1

Text Generation · Model size: 7B · Quantization: FP8 · Context length: 4k · Published: Feb 16, 2024 · License: llama2 · Architecture: Transformer · Open Weights

HiTZ/latxa-7b-v1.1 is a 7-billion-parameter language model developed by the HiTZ Research Center and the IXA research group (University of the Basque Country UPV/EHU), based on Meta's Llama 2 architecture. It was further pre-trained on a 4.2-billion-token Basque corpus, making it highly proficient in Basque. The model excels at Basque language understanding and generation, outperforming previous open models for this low-resource language, and supports a 4,096-token context length.


Latxa 7B v1.1: A Specialized LLM for Basque

Latxa 7B v1.1 is a 7-billion-parameter Large Language Model (LLM) developed by the HiTZ Research Center and the IXA research group, building on Meta's Llama 2 architecture. It addresses the performance gap for low-resource languages through extensive continued pre-training on a dedicated Basque corpus of 4.3 million documents and 4.2 billion tokens. The model significantly outperforms other open models on Basque language tasks and is competitive in language proficiency with much larger models such as GPT-4 Turbo, particularly in understanding and generation.
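Since the weights are published under the model id above, inference follows the standard Hugging Face `transformers` pattern for causal LMs. The sketch below is a minimal example, not an official recipe; the generation settings are illustrative assumptions:

```python
# Minimal inference sketch for Latxa 7B v1.1 using the standard
# transformers causal-LM API. Generation parameters are illustrative.
from transformers import AutoModelForCausalLM, AutoTokenizer

MODEL_ID = "HiTZ/latxa-7b-v1.1"

def generate(prompt: str, max_new_tokens: int = 64) -> str:
    """Continue a Basque prompt with the Latxa base model."""
    tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
    model = AutoModelForCausalLM.from_pretrained(MODEL_ID, device_map="auto")
    inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
    output = model.generate(**inputs, max_new_tokens=max_new_tokens)
    return tokenizer.decode(output[0], skip_special_tokens=True)

if __name__ == "__main__":
    # Latxa is a base (non-instruct) model, so plain continuation
    # prompts work better than chat-style instructions.
    print(generate("Euskal Herriko mendirik ezagunenak "))
```

Note that this is a base model without instruction tuning, so it is best driven with continuation-style prompts rather than chat instructions.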

Key Capabilities

  • Basque Language Specialization: Optimized for high performance in Basque, trained on a high-quality, carefully filtered corpus.
  • Llama 2 Foundation: Inherits the robust architecture and commercial-friendly Llama 2 license.
  • Reproducible Research: Released alongside its pre-training corpora and evaluation datasets to foster research in low-resource language LLMs.
  • Strong Benchmarking: Achieves an average score of 42.26% across Basque evaluation benchmarks, including XStoryCloze, Belebele, BasqueGLUE, and EusProficiency, surpassing other 7B models such as Mistral 7B and Llama 2 7B.

Good For

  • Basque Language Applications: Ideal for developing applications requiring deep understanding and generation in Basque.
  • Further Fine-tuning: Serves as a strong base for task-specific or instruction fine-tuning for various Basque use cases.
  • Research in Low-Resource NLP: Provides a valuable resource for researchers working on LLMs for languages with limited digital resources.