HiTZ/latxa-70b-v1.2

TEXT GENERATION | Concurrency Cost: 4 | Model Size: 69B | Quant: FP8 | Ctx Length: 32k | Published: Jun 11, 2024 | License: llama2 | Architecture: Transformer | Open Weights | Cold

Latxa-70b-v1.2 is a 69-billion-parameter large language model developed by the HiTZ Research Center & IXA Research group, based on Meta's Llama 2 architecture. It received continued pretraining on a 4.2-billion-token Basque corpus and outperforms previous open models on Basque language understanding and generation benchmarks.


Latxa-70b-v1.2: A Specialized LLM for Basque

Latxa-70b-v1.2 is a 69-billion-parameter large language model developed by the HiTZ Research Center & IXA Research group, built on Meta's Llama 2 architecture. It was designed to narrow the gap in LLM performance for low-resource languages such as Basque. The model underwent continued pretraining on the Latxa Corpus v1.1, which comprises 4.3 million documents and 4.2 billion tokens of Basque data, plus an additional 500K English documents from the Pile to prevent catastrophic forgetting.
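To put the corpus figures in proportion, the short calculation below derives two ratios from the numbers quoted above. It assumes the 4.3 million document count covers Basque data only, with the 500K English documents added on top; the English token count is not stated here, so the share is computed over documents rather than tokens.

```python
# Corpus figures as quoted in the description above.
BASQUE_DOCS = 4_300_000        # assumed to exclude the English documents
BASQUE_TOKENS = 4_200_000_000
ENGLISH_DOCS = 500_000         # English documents from the Pile

total_docs = BASQUE_DOCS + ENGLISH_DOCS

# Share of the training documents that are English (by document count).
english_doc_share = ENGLISH_DOCS / total_docs

# Mean length of a Basque document, in tokens.
avg_tokens_per_basque_doc = BASQUE_TOKENS / BASQUE_DOCS

print(f"English share of documents: {english_doc_share:.1%}")
print(f"Average tokens per Basque document: {avg_tokens_per_basque_doc:.0f}")
```

Under that reading, roughly one document in ten is English, and a Basque document averages just under a thousand tokens.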

Key Capabilities

  • Exceptional Basque Language Proficiency: Latxa-70b-v1.2 significantly outperforms previous open models in Basque language tasks, demonstrating strong understanding and generation capabilities.
  • Competitive with GPT-4 Turbo on some tasks: Achieves results comparable to GPT-4 Turbo in Basque language proficiency and understanding, though it lags behind in reading comprehension and knowledge-intensive tasks.
  • Pretrained base model: A base language model without task-specific tuning, suitable for direct prompting or further fine-tuning for specific Basque-centric applications.

Good for

  • Basque Language Applications: Suited to use cases that need strong performance in the Basque language, including text generation, analysis, and understanding.
  • Research and Development in Low-Resource Languages: Promotes research and technological development for the Basque language and offers insights for other low-resource languages.

Note that Latxa models are neither instruction-tuned nor designed as chat assistants, and their performance is not guaranteed for languages other than Basque.
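Because the model is a base model rather than a chat assistant, it is usually prompted in completion style, for example with a few worked examples followed by an open slot. The sketch below shows one way to do that with the Hugging Face transformers library. The helper names (build_few_shot_prompt, generate), the Basque labels "Galdera"/"Erantzuna", and the sample questions are illustrative assumptions, not part of the Latxa release. Loading in bfloat16 with device_map="auto" is also an assumption; it requires the accelerate package and enough GPU memory for a model of this size.

```python
from typing import Sequence, Tuple


def build_few_shot_prompt(
    examples: Sequence[Tuple[str, str]],
    query: str,
    input_label: str = "Galdera",     # "Question" (illustrative label)
    output_label: str = "Erantzuna",  # "Answer" (illustrative label)
) -> str:
    """Format solved examples plus one open query as a completion prompt."""
    blocks = [f"{input_label}: {q}\n{output_label}: {a}" for q, a in examples]
    # The final block leaves the answer empty so the model continues from it.
    blocks.append(f"{input_label}: {query}\n{output_label}:")
    return "\n\n".join(blocks)


def generate(prompt: str, model_id: str = "HiTZ/latxa-70b-v1.2",
             max_new_tokens: int = 32) -> str:
    """Greedy-decode a continuation of `prompt` (hypothetical helper)."""
    # Imported lazily so the prompt builder works without these packages.
    import torch
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(model_id)
    model = AutoModelForCausalLM.from_pretrained(
        model_id, torch_dtype=torch.bfloat16, device_map="auto"
    )
    inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
    output_ids = model.generate(
        **inputs, max_new_tokens=max_new_tokens, do_sample=False
    )
    # Keep only the newly generated tokens, not the echoed prompt.
    new_tokens = output_ids[0][inputs["input_ids"].shape[1]:]
    return tokenizer.decode(new_tokens, skip_special_tokens=True)


# Illustrative prompt; printing it does not load the model.
prompt = build_few_shot_prompt(
    examples=[
        ("Zein da Frantziako hiriburua?", "Paris"),
        ("Zein da Espainiako hiriburua?", "Madril"),
    ],
    query="Zein da Italiako hiriburua?",
)
print(prompt)
```

Calling generate(prompt) would then return the model's continuation. A base model may keep writing further question and answer pairs, so in practice the output is often cut at the first blank line.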