Name: proxectonos/Llama-3.1-Carballo API
Brand: Featherless.ai
Price: 10.00 USD
Availability: InStock
Author: proxectonos

Llama-3.1-Carballo: Multilingual LLM for Galician and Romance Languages

Llama-3.1-Carballo is an 8-billion-parameter causal language model developed by proxectonos and built on the Meta Llama-3.1-8B architecture. It underwent continual pretraining on a multilingual corpus of nearly 20 billion tokens, with a strong emphasis on Galician text.

Key Capabilities

Multilingual Proficiency: Specialized in Galician, Portuguese, Spanish, Catalan, and English, with a particular strength in Galician.
Text Generation: Ready to use for a range of text generation tasks.
Fine-tuning Base: Suitable as a base model for further fine-tuning on specific downstream applications.
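
As a rough illustration of the text generation capability, the sketch below loads the model through the HuggingFace Transformers `pipeline` API. It assumes the weights are published on the Hugging Face Hub under the ID shown in the listing title; the helper names (`generation_kwargs`, `generate`) and the sampling defaults are hypothetical choices for this example, not settings recommended by the authors.

```python
# Hub ID taken from the listing title; availability on the Hugging Face Hub is assumed.
MODEL_ID = "proxectonos/Llama-3.1-Carballo"


def generation_kwargs(max_new_tokens=64, temperature=0.7):
    """Return sampling settings; the defaults are illustrative, not tuned values."""
    return {
        "max_new_tokens": max_new_tokens,
        "do_sample": True,
        "temperature": temperature,
    }


def generate(prompt, **overrides):
    """Continue `prompt` with the model. Downloads the 8B weights on first use."""
    # Imported lazily so the helpers above work without transformers installed.
    from transformers import pipeline

    generator = pipeline("text-generation", model=MODEL_ID)
    outputs = generator(prompt, **generation_kwargs(**overrides))
    # The pipeline returns a list of dicts; "generated_text" includes the prompt.
    return outputs[0]["generated_text"]
```

Calling `generate("Galicia é")` would return the prompt followed by the model's continuation. An 8-billion-parameter model needs substantial memory, so a GPU is likely required for practical use.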

Training Details

The model was trained using HuggingFace Transformers and PyTorch, leveraging DeepSpeed for efficiency. The training corpus included 5 billion Galician tokens (from CorpusNós), 3 billion Portuguese, 3.5 billion Spanish, 3.4 billion English, and 3.6 billion Catalan tokens (from CATalog). Training was conducted at the Galicia Supercomputing Center (CESGA).
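
The per-language figures can be checked with a little arithmetic. The sketch below sums the token counts quoted above and derives each language's share of the corpus; the function and variable names are made up for this example. The counts add up to 18.5 billion tokens, which matches the "nearly 20 billion" description, and Galician accounts for roughly 27% of the total.

```python
import math

# Token counts in billions, as listed in the training details above.
CORPUS_TOKENS_B = {
    "Galician": 5.0,
    "Portuguese": 3.0,
    "Spanish": 3.5,
    "English": 3.4,
    "Catalan": 3.6,
}


def corpus_shares(tokens):
    """Return the total token count and each language's fraction of it."""
    total = math.fsum(tokens.values())
    return total, {lang: count / total for lang, count in tokens.items()}


total_b, shares = corpus_shares(CORPUS_TOKENS_B)
print(f"Total: {total_b:.1f}B tokens")
# List languages from largest to smallest share.
for lang, share in sorted(shares.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{lang:<11}{share:6.1%}")
```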

Intended Use

Llama-3.1-Carballo is designed for causal language modeling and can be used for tasks such as translation, question answering, sentiment analysis, and named entity recognition in its supported languages. It is particularly valuable for applications requiring high performance in Galician.
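
Because the model is a causal language model, tasks such as sentiment analysis are typically framed as text to be continued. One common approach is a few-shot prompt: a handful of solved examples followed by the unsolved query. The sketch below builds such a prompt; the helper name, the field labels, and the Galician example sentences are all hypothetical, and the listing does not prescribe any particular prompt format.

```python
def build_few_shot_prompt(examples, query, input_label="Texto", output_label="Sentimento"):
    """Format (text, label) pairs plus a final query as a prompt to be continued."""
    blocks = [
        f"{input_label}: {text}\n{output_label}: {label}"
        for text, label in examples
    ]
    # The last block leaves the label empty so the model fills it in.
    blocks.append(f"{input_label}: {query}\n{output_label}:")
    return "\n\n".join(blocks)


# Illustrative examples only; they do not come from any evaluation set.
EXAMPLES = [
    ("Encantoume a película.", "positivo"),
    ("O servizo foi moi lento.", "negativo"),
]

prompt = build_few_shot_prompt(EXAMPLES, "A comida estaba moi boa.")
print(prompt)
```

The resulting string can be passed to any text generation call; the first word the model produces after the final label is then read as its prediction.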
