irlab-udc/Llama-3.1-8B-Instruct-Galician
The irlab-udc/Llama-3.1-8B-Instruct-Galician model, also known as Cabuxa 2.0, is an 8-billion-parameter instruction-tuned causal language model developed by the UDC Information Retrieval Lab (IRLab). It is a continued-pretraining adaptation of Meta's Llama-3.1-8B-Instruct, tailored to the Galician language using the CorpusNós dataset. The model is optimized for natural language processing tasks in Galician, aiming to improve AI accessibility for underrepresented languages.
Key Capabilities
- Galician Language Adaptation: The model is specifically adapted, via continued pretraining, for natural language processing in Galician, addressing the underrepresentation of minority languages in LLMs.
- Instruction Following: Inherits instruction-following capabilities from its Llama-3.1-8B-Instruct base, adapted for Galician-specific prompts.
- Performance: In evaluations, this model has been shown to outperform both the base Llama-3.1 model and a comparable Galician model, in both quantitative and qualitative terms, on Galician NLP tasks.
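The model can be loaded like any Llama 3.1 checkpoint with the Hugging Face `transformers` library. The sketch below is a minimal, unofficial example: the `pipeline` chat-format call assumes a recent `transformers` release, and the Galician prompt is illustrative.

```python
MODEL_ID = "irlab-udc/Llama-3.1-8B-Instruct-Galician"


def build_chat(user_prompt: str) -> list[dict]:
    """Wrap a user prompt in the Llama 3.1 chat message format."""
    return [{"role": "user", "content": user_prompt}]


def generate(prompt: str, max_new_tokens: int = 128) -> str:
    """Generate a Galician response; requires a GPU and ~16 GB of VRAM in bf16."""
    # Imports are local so this file loads without transformers/torch installed.
    import torch
    from transformers import pipeline

    generator = pipeline(
        "text-generation",
        model=MODEL_ID,
        torch_dtype=torch.bfloat16,
        device_map="auto",
    )
    out = generator(build_chat(prompt), max_new_tokens=max_new_tokens)
    # With chat input, recent transformers returns the full message list;
    # the last message is the assistant's reply.
    return out[0]["generated_text"][-1]["content"]
```

Example call: `generate("Cal é a capital de Galicia?")`.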
Training Details
The model was trained with a learning rate of 1e-4 and a batch size of 32 for one epoch. Training used 4 NVIDIA A100 SXM4 80 GB GPUs for 60 hours, with an estimated carbon emission of 10.37 kg CO₂ eq.
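For rough accounting, the reported figures imply 240 GPU-hours of compute. A small arithmetic sketch (all inputs are the numbers stated above; the per-GPU-hour rate is derived, not reported):

```python
# Training-footprint arithmetic from the reported figures.
NUM_GPUS = 4          # NVIDIA A100 SXM4 80 GB
HOURS = 60            # wall-clock training time
EMISSIONS_KG = 10.37  # kg CO2 eq, as reported

gpu_hours = NUM_GPUS * HOURS                # total GPU-hours of compute
kg_per_gpu_hour = EMISSIONS_KG / gpu_hours  # derived emission rate

print(gpu_hours)                    # 240
print(round(kg_per_gpu_hour, 4))    # ~0.0432 kg CO2 eq per GPU-hour
```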
Use Cases
This model is ideal for applications requiring robust language understanding and generation in Galician, such as:
- Conversational AI systems in Galician.
- Text generation and summarization for Galician content.
- Research and development in NLP for underrepresented languages.