HiTZ/latxa-7b-v1

Task: Text generation
Concurrency cost: 1
Model size: 7B
Quantization: FP8
Context length: 4k
Published: Jan 16, 2024
License: llama2
Architecture: Transformer
Open weights

HiTZ/latxa-7b-v1 is a 7-billion-parameter language model developed by the HiTZ Research Center and the IXA Research Group (University of the Basque Country UPV/EHU), based on Meta's LLaMA 2. It was further trained on the Euscrawl corpus to improve performance on Basque, a low-resource language that general-purpose LLMs handle poorly. It is intended for applications that require high-quality Basque language processing.

Latxa 7B: A Specialized LLM for Basque

Latxa 7B is a 7-billion-parameter Large Language Model developed by the HiTZ Research Center and the IXA Research Group to address the limitations of general LLMs on low-resource languages such as Basque. It starts from Meta's LLaMA 2 and continues training on Euscrawl, a highly curated Basque corpus, so that it outperforms general LLMs of comparable size on Basque language tasks.

Key Capabilities & Features

  • Basque Language Specialization: Significantly outperforms general LLMs (like LLaMA 2 7B, BLOOM 7B, XGLM 7B) on various Basque-specific benchmarks, including reading comprehension, commonsense reasoning, sentiment analysis, and topic classification.
  • Foundation Model: Released as a pre-trained LLM, suitable for direct prompting or further fine-tuning for specific Basque-centric use cases.
  • Multilingual Support: Primarily focused on Basque (eu), with some English (en) data included during training to prevent catastrophic forgetting.
  • Reproducibility: This specific version (v1) is provided for reproducibility, with newer versions available in the Latxa Collection.
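Because the model is released as a pre-trained foundation model rather than an instruction-tuned one, direct prompting tends to work best when the prompt shows the task through a few worked examples and leaves the last answer open. The following is a minimal sketch of that idea, not the authors' own recipe. The helper names (`build_few_shot_prompt`, `generate`), the field labels, and the Basque example sentences are illustrative assumptions. The `generate` function assumes the Hugging Face `transformers` library is installed and that the weights are available under the model ID shown in the title.

```python
from typing import List, Tuple

# Model ID taken from the page title; availability on your hub is an assumption.
MODEL_ID = "HiTZ/latxa-7b-v1"


def build_few_shot_prompt(
    examples: List[Tuple[str, str]],
    query: str,
    input_label: str = "Iritzia",       # hypothetical label, "opinion"
    output_label: str = "Sentimendua",  # hypothetical label, "sentiment"
) -> str:
    """Format solved examples plus one open query as a single prompt string."""
    blocks = [
        f"{input_label}: {text}\n{output_label}: {label}"
        for text, label in examples
    ]
    # The final block ends right after the output label so the model completes it.
    blocks.append(f"{input_label}: {query}\n{output_label}:")
    return "\n\n".join(blocks)


def generate(prompt: str, max_new_tokens: int = 5) -> str:
    """Greedy completion using the transformers text-generation pipeline.

    The import is local so the prompt helper works without transformers.
    Loading a 7B model requires substantial memory.
    """
    from transformers import pipeline

    pipe = pipeline("text-generation", model=MODEL_ID)
    result = pipe(prompt, max_new_tokens=max_new_tokens, do_sample=False)
    # By default the pipeline returns the prompt followed by the continuation.
    return result[0]["generated_text"][len(prompt):].strip()


# Illustrative (made-up) sentiment examples in Basque.
EXAMPLES = [
    ("Film hau oso ona da.", "positiboa"),
    ("Zerbitzua oso txarra izan zen.", "negatiboa"),
]

if __name__ == "__main__":
    prompt = build_few_shot_prompt(EXAMPLES, "Liburu hau asko gustatu zait.")
    print(prompt)
    # Uncomment to run the model (downloads the weights):
    # print(generate(prompt))
```

Keeping prompt construction separate from model loading makes the prompt format easy to inspect and adjust before committing to a full model download. The same prompt string can also serve as a template when preparing data for further fine-tuning.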

Ideal Use Cases

  • Developing applications and research for the Basque language.
  • Tasks requiring high accuracy in Basque text generation, understanding, and analysis.
  • Fine-tuning for specific downstream applications in Basque, such as chatbots, content creation, or information extraction.