Nos-PT/Llama-Carvalho-PT

Text Generation · Concurrency Cost: 1 · Model Size: 8B · Quant: FP8 · Context Length: 32k · Published: Jan 30, 2025 · License: llama3.1 · Architecture: Transformer

Llama-Carvalho-PT is an 8 billion parameter transformer-based causal language model developed by Nos-PT. It is a continually pre-trained version of Meta's Llama-3.1-8B, specialized for Galician, Portuguese, Spanish, and English, with a strong emphasis on Portuguese. The model excels at text generation in these languages and is well suited to fine-tuning for specific multilingual scenarios, leveraging its 32,768-token context length.


Overview

Llama-Carvalho-PT is an 8 billion parameter causal language model, part of the Carvalho family of LLMs, developed by Nos-PT. It is built upon Meta's Llama-3.1-8B and continually pre-trained on a 340-million-token multilingual corpus with particular emphasis on Portuguese. This specialization aims to enhance its performance in Portuguese and Galician while maintaining proficiency in Spanish and English.
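As a causal language model, it can be used directly for generation. The sketch below shows one plausible way to load it with Hugging Face `transformers`; the repo id `Nos-PT/Llama-Carvalho-PT` and the sampling parameters are assumptions, not taken from the card, and the inference section is gated behind a flag since it requires the model weights.

```python
# Minimal usage sketch, assuming the model is published on the Hugging Face Hub
# under the repo id below (verify before use) and that `transformers` is installed.
MODEL_ID = "Nos-PT/Llama-Carvalho-PT"  # assumed repo id
CTX_LEN = 32768  # context length stated on the model card


def fits_context(n_prompt_tokens: int, n_new_tokens: int, ctx_len: int = CTX_LEN) -> bool:
    """Check that the prompt plus the requested completion fits the context window."""
    return n_prompt_tokens + n_new_tokens <= ctx_len


RUN_INFERENCE = False  # set True when transformers and the model weights are available

if RUN_INFERENCE:
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
    model = AutoModelForCausalLM.from_pretrained(MODEL_ID, device_map="auto")

    prompt = "O clima em Lisboa hoje está"  # Portuguese prompt, per the model's focus
    inputs = tokenizer(prompt, return_tensors="pt").to(model.device)

    if fits_context(inputs["input_ids"].shape[1], 100):
        out = model.generate(**inputs, max_new_tokens=100, do_sample=True, top_p=0.9)
        print(tokenizer.decode(out[0], skip_special_tokens=True))
```

The explicit context check matters for the 32k window: long retrieval-augmented prompts can silently truncate if the prompt plus `max_new_tokens` exceeds the limit.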

Key Capabilities

  • Multilingual Text Generation: Proficient in generating text across Galician, Portuguese, Spanish, and English, with a focus on Portuguese and Galician nuances.
  • Continual Pre-training: Benefits from additional training data to adapt Llama-3.1-8B for specific linguistic varieties, including European Portuguese.
  • Causal Language Modeling: Ready for direct use in text generation tasks and serves as a strong base for further fine-tuning.
  • Evaluation on Portuguese Leaderboard: Achieves an average score of 54.06 on the Open Portuguese LLM Leaderboard, with notable scores in tasks like Assin2 RTE (87.50) and HateBR Binary (76.93).

Intended Use Cases

  • Text Generation: Ideal for applications requiring text creation in Portuguese, Galician, Spanish, and English.
  • Fine-tuning: Suitable as a base model for domain-specific or task-specific fine-tuning, particularly for applications targeting the mentioned languages.
  • Research and Development: Useful for researchers exploring multilingual LLMs and language adaptation strategies, especially for underrepresented language varieties.
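For the fine-tuning use case, a parameter-efficient approach such as LoRA keeps memory requirements manageable for an 8B base. The sketch below is illustrative only: the hyperparameters and target module names are common choices for Llama-family models, not values from this card, and the training section is gated behind a flag since it needs `peft` and the model weights.

```python
# Hypothetical LoRA fine-tuning configuration sketch; all hyperparameters are
# illustrative assumptions, not recommendations from the Llama-Carvalho-PT card.
LORA_CONFIG = {
    "r": 16,                       # low-rank adapter dimension
    "lora_alpha": 32,              # scaling factor for adapter updates
    "target_modules": ["q_proj", "k_proj", "v_proj", "o_proj"],  # attention projections
    "lora_dropout": 0.05,
    "task_type": "CAUSAL_LM",      # matches the model's causal-LM objective
}

RUN_TRAINING = False  # set True when peft, transformers, and weights are available

if RUN_TRAINING:
    from peft import LoraConfig, get_peft_model
    from transformers import AutoModelForCausalLM

    model = AutoModelForCausalLM.from_pretrained("Nos-PT/Llama-Carvalho-PT")  # assumed repo id
    model = get_peft_model(model, LoraConfig(**LORA_CONFIG))
    model.print_trainable_parameters()  # only the adapter weights are trainable
```

Freezing the base weights and training only the adapters preserves the model's multilingual pre-training while adapting it to a domain-specific Portuguese or Galician corpus.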