Llama-Carvalho-PT-GL: Multilingual Model for Iberian Romance Languages
Llama-Carvalho-PT-GL is an 8-billion-parameter causal language model, continually pretrained by Nos-PT from Meta's Llama-3.1-8B. It is designed to excel in Galician and Portuguese while maintaining proficiency in Spanish and English, and belongs to the Carvalho family of LLMs, which focuses on these languages.
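As a Llama-style checkpoint, the model can be loaded with the standard Transformers APIs. The snippet below is a minimal sketch: the hub ID `Nos-PT/Llama-Carvalho-PT-GL` and the Galician prompt are illustrative assumptions, so check the model page for the published identifier.

```python
# Minimal text-generation sketch using Hugging Face Transformers.
# NOTE: the hub ID below is an assumption based on the model name.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "Nos-PT/Llama-Carvalho-PT-GL"  # hypothetical hub ID

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,  # 8B weights fit in roughly 16 GB at bf16
    device_map="auto",
)

prompt = "A lingua galega"  # illustrative Galician prompt
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=50, do_sample=True, top_p=0.9)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```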
Key Capabilities
- Multilingual Proficiency: Strong performance in Galician, Portuguese, Spanish, and English, with particular emphasis on Galician and Portuguese.
- Continual Pretraining: Enhanced language understanding through additional training on a diverse corpus of 540M plain-text tokens and 72M instruction tokens.
- Text Generation: Ready to use for causal language modeling and text generation tasks.
- Fine-tuning Ready: Can be further fine-tuned for specific downstream applications (see the sketch after this list).
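Since the model is advertised as fine-tuning ready, a common low-cost route is parameter-efficient fine-tuning. The sketch below uses the `peft` library with LoRA; this is not the authors' recipe, and the rank, target modules, and hub ID are illustrative assumptions.

```python
# LoRA fine-tuning sketch with peft (illustrative, not the authors' recipe).
from peft import LoraConfig, get_peft_model
from transformers import AutoModelForCausalLM

model = AutoModelForCausalLM.from_pretrained("Nos-PT/Llama-Carvalho-PT-GL")  # hypothetical ID

lora_config = LoraConfig(
    r=16,                                 # illustrative adapter rank
    lora_alpha=32,
    target_modules=["q_proj", "v_proj"],  # typical Llama attention projections
    task_type="CAUSAL_LM",
)

model = get_peft_model(model, lora_config)
model.print_trainable_parameters()  # only the small adapter matrices train
# Train with transformers.Trainer or a custom loop as usual.
```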
Training Details
The model was trained with Hugging Face Transformers and PyTorch, using DeepSpeed for training efficiency. The plain-text corpus prioritized Galician (232M tokens) and Portuguese (250M tokens), alongside Spanish and English; the instruction data likewise favored Galician (26.7M tokens) and Portuguese (44M tokens).
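The card does not publish the actual training configuration, so the following is only a sketch of how a Transformers-plus-DeepSpeed continual-pretraining run is typically wired; every hyperparameter, the inline ZeRO config, and the tiny stand-in corpus are assumptions for illustration.

```python
# Illustrative continual-pretraining wiring with Transformers + DeepSpeed.
# None of these values come from the model card; they only show the shape
# of such a setup. Launch with the DeepSpeed launcher: `deepspeed train.py`.
from datasets import Dataset
from transformers import (
    AutoModelForCausalLM,
    AutoTokenizer,
    DataCollatorForLanguageModeling,
    Trainer,
    TrainingArguments,
)

base_id = "meta-llama/Llama-3.1-8B"  # the stated starting checkpoint
tokenizer = AutoTokenizer.from_pretrained(base_id)
tokenizer.pad_token = tokenizer.eos_token
model = AutoModelForCausalLM.from_pretrained(base_id)

# Tiny stand-in for the real 540M-token multilingual corpus.
texts = ["O galego é unha lingua romance.", "O português é falado em Portugal."]
dataset = Dataset.from_dict({"text": texts}).map(
    lambda ex: tokenizer(ex["text"], truncation=True, max_length=512),
    remove_columns=["text"],
)

args = TrainingArguments(
    output_dir="carvalho-cpt",      # hypothetical output path
    per_device_train_batch_size=4,  # illustrative
    gradient_accumulation_steps=8,  # illustrative
    learning_rate=1e-5,             # illustrative
    bf16=True,
    deepspeed={                     # minimal inline ZeRO-2 config
        "zero_optimization": {"stage": 2},
        "train_micro_batch_size_per_gpu": "auto",
        "gradient_accumulation_steps": "auto",
        "bf16": {"enabled": True},
    },
)

trainer = Trainer(
    model=model,
    args=args,
    train_dataset=dataset,
    data_collator=DataCollatorForLanguageModeling(tokenizer, mlm=False),
)
trainer.train()
```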
Evaluation
Initial evaluations on the Open Portuguese LLM Leaderboard show an average score of 60.06, with strong results on tasks such as Assin2 RTE (89.30) and HateBR Binary (82.83).
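The Open Portuguese LLM Leaderboard is built on a Portuguese fork of EleutherAI's lm-evaluation-harness, so results of this kind can in principle be reproduced locally. The sketch below uses the harness's Python API; the task identifier and hub ID are assumptions, so verify the exact names against the leaderboard's task registry.

```python
# Hedged reproduction sketch with EleutherAI's lm-evaluation-harness.
# Task and model identifiers are assumptions; verify them before running.
import lm_eval

results = lm_eval.simple_evaluate(
    model="hf",
    model_args="pretrained=Nos-PT/Llama-Carvalho-PT-GL",  # hypothetical hub ID
    tasks=["assin2_rte"],  # assumed task name for Assin2 RTE
)
print(results["results"])  # per-task metrics
```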
Good For
- Applications requiring high-quality text generation in Galician and Portuguese.
- Developers looking for a base model to fine-tune for specific tasks in these languages.
- Research and development in multilingual NLP, particularly for Iberian Romance languages.