Nos-PT/Llama-Carvalho-PT-GL
Nos-PT/Llama-Carvalho-PT-GL is an 8-billion-parameter, transformer-based causal language model developed by Nos-PT, built upon meta-llama/Llama-3.1-8B. It was continually pretrained on a multilingual corpus that emphasizes Galician and Portuguese alongside Spanish and English. The model is specialized for text generation in these languages, with a particular focus on Galician and European Portuguese, and is suitable for fine-tuning on specific downstream scenarios.
Model Overview
Nos-PT/Llama-Carvalho-PT-GL is an 8-billion-parameter causal language model and part of the Carvalho family of LLMs. It is a continually pretrained version of meta-llama/Llama-3.1-8B, specifically enhanced for Galician and Portuguese while retaining its knowledge of Spanish and English.
Key Capabilities
- Multilingual Proficiency: Specialized in Galician and Portuguese, with strong capabilities in Spanish and English.
- Causal Language Modeling: Ready to use for text generation tasks (see the usage sketch after this list).
- Fine-tuning: Designed to be adaptable for specific downstream applications through fine-tuning.
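A minimal text-generation sketch with the Transformers library is shown below. The model ID is taken from this card; the dtype and sampling parameters (temperature, top_p, max_new_tokens) are illustrative assumptions, not values recommended by the authors.

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "Nos-PT/Llama-Carvalho-PT-GL"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,  # assumption: bf16-capable GPU; use float16/float32 otherwise
    device_map="auto",
)

# A short Galician prompt for plain-text completion.
prompt = "A lingua galega"
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)

outputs = model.generate(
    **inputs,
    max_new_tokens=50,
    do_sample=True,
    temperature=0.7,
    top_p=0.9,
)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```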
Training Details
The model was trained with Hugging Face Transformers and PyTorch, using DeepSpeed for distributed training, on the MareNostrum V supercomputer at the Barcelona Supercomputing Center (BSC). The training corpus comprised 540M tokens of plain text and 72M tokens of instructions, weighted heavily toward Galician and Portuguese, with the remainder in Spanish and English:

| Language | Plain text (540M tokens) | Instructions (72M tokens) |
|---|---|---|
| Galician | 42.96% | 37.01% |
| Portuguese | 46.29% | 61.00% |
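Since the card highlights fine-tuning with this same stack (Transformers, PyTorch, DeepSpeed), the following is a minimal, hedged sketch of a downstream causal-LM fine-tuning run. The toy dataset, hyperparameters, and the `ds_config.json` path are placeholders, not details from the original training setup.

```python
from datasets import Dataset
from transformers import (
    AutoModelForCausalLM,
    AutoTokenizer,
    DataCollatorForLanguageModeling,
    Trainer,
    TrainingArguments,
)

model_id = "Nos-PT/Llama-Carvalho-PT-GL"
tokenizer = AutoTokenizer.from_pretrained(model_id)
tokenizer.pad_token = tokenizer.eos_token  # Llama tokenizers ship without a pad token
model = AutoModelForCausalLM.from_pretrained(model_id)

# Toy corpus standing in for a real Galician/Portuguese fine-tuning set.
texts = ["Exemplo de texto en galego.", "Exemplo de texto em português."]
dataset = Dataset.from_dict({"text": texts}).map(
    lambda batch: tokenizer(batch["text"], truncation=True, max_length=512),
    batched=True,
    remove_columns=["text"],
)

args = TrainingArguments(
    output_dir="carvalho-finetuned",
    per_device_train_batch_size=1,
    num_train_epochs=1,
    bf16=True,                   # assumption: Ampere-or-newer GPU
    deepspeed="ds_config.json",  # hypothetical DeepSpeed config (e.g. ZeRO stage 2/3)
)

trainer = Trainer(
    model=model,
    args=args,
    train_dataset=dataset,
    data_collator=DataCollatorForLanguageModeling(tokenizer, mlm=False),
)
trainer.train()
```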
Performance
Evaluations on the Open Portuguese LLM Leaderboard show an average score of 60.06, with notable results on tasks such as Assin2 RTE (89.30) and HateBR Binary (82.83).
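The Open Portuguese LLM Leaderboard is built on an lm-evaluation-harness pipeline, so a single task can, in principle, be reproduced locally. The sketch below is an assumption about that workflow: the task identifier "assin2_rte" and the few-shot count are guesses and may differ in the leaderboard's harness fork.

```python
import lm_eval

# Evaluate the model on one leaderboard-style task.
# "assin2_rte" is an assumed task name; check `lm_eval --tasks list`
# (or the leaderboard's harness fork) for the exact identifier.
results = lm_eval.simple_evaluate(
    model="hf",
    model_args="pretrained=Nos-PT/Llama-Carvalho-PT-GL",
    tasks=["assin2_rte"],
    num_fewshot=15,  # assumption: leaderboard few-shot settings vary by task
)
print(results["results"])
```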