Gervásio 7B PTPT Decoder: European Portuguese LLM

This model, developed by the NLX (Natural Language and Speech Group) at the University of Lisbon, is a 7-billion-parameter decoder-only language model fine-tuned for European Portuguese (pt-PT). It is built on the LLaMA-2 architecture and has a 4096-token context length.
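
Because prompt and generated tokens share the 4096-token context window, it can help to compute how many new tokens fit before calling the model. The following is a minimal sketch; the helper name `max_new_tokens` and its parameters are illustrative assumptions and are not part of the model's tooling.

```python
CONTEXT_LENGTH = 4096  # context length stated in the description above


def max_new_tokens(prompt_len, requested, context_length=CONTEXT_LENGTH):
    """Return how many tokens can be generated without overflowing the window.

    prompt_len: number of tokens in the already-tokenized prompt.
    requested:  number of new tokens the caller would like to generate.
    """
    if prompt_len >= context_length:
        raise ValueError("prompt does not fit in the context window")
    # Generation is capped by whatever room the prompt leaves.
    return min(requested, context_length - prompt_len)
```

For example, a 4000-token prompt leaves room for at most 96 generated tokens, however many are requested.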

Key Capabilities

Portuguese Language Generation: Optimized for generating text in European Portuguese.
Instruction Following: Fine-tuned on instruction datasets, including extraGLUE-Instruct, which consists of machine-translated and augmented versions of GLUE and SuperGLUE tasks.
Reasoning and Inference: Benchmarks show significant improvement over base LLaMA-2 models on tasks such as MRPC (paraphrase detection), RTE (textual entailment), and COPA (causal reasoning).
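
To make the idea of an instruction-formatted task concrete, here is a minimal sketch of how an RTE (textual entailment) item might be turned into a prompt and completion pair. The template wording, field names, and the `build_rte_example` helper are hypothetical; the actual extraGLUE-Instruct templates are not reproduced here.

```python
# Hypothetical pt-PT template for illustration only; not the one used in training.
RTE_TEMPLATE = (
    "Premissa: {premise}\n"
    "Hipótese: {hypothesis}\n"
    'A hipótese decorre da premissa? Responde "sim" ou "não".\n'
    "Resposta:"
)


def build_rte_example(premise, hypothesis, entailed):
    """Format one entailment item as an instruction prompt plus target answer."""
    prompt = RTE_TEMPLATE.format(premise=premise, hypothesis=hypothesis)
    # The completion is the text the model is expected to produce after the prompt.
    completion = " sim" if entailed else " não"
    return {"prompt": prompt, "completion": completion}


example = build_rte_example(
    premise="O gato dorme no sofá.",
    hypothesis="O gato está a dormir.",
    entailed=True,
)
print(example["prompt"] + example["completion"])
```

At inference time the same prompt would be sent without the completion, and the generated continuation compared against the expected label.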

Training Details

Training used supervised fine-tuning with a causal language modeling objective. The datasets were machine-translated into European Portuguese and augmented, and the instruction templates were crafted manually. The model was trained with a learning rate of 2e-5 and a sequence length of 512 tokens.
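
The causal language modeling objective mentioned above can be sketched in plain Python: each position is scored on how well it predicts the next token, and the loss is the mean cross-entropy over those predictions. This is an illustrative sketch, not the authors' training code; the function names are assumptions, and only the learning rate and sequence length are taken from the description.

```python
import math

MAX_SEQ_LEN = 512      # sequence length stated above
LEARNING_RATE = 2e-5   # learning rate stated above (shown for reference, unused here)


def truncate(token_ids, max_len=MAX_SEQ_LEN):
    """Keep at most max_len tokens, as a fixed training sequence length implies."""
    return token_ids[:max_len]


def causal_lm_loss(logits, token_ids):
    """Mean next-token cross-entropy.

    logits[t] is the list of unnormalized scores over the vocabulary produced
    at position t; it is compared against the token at position t + 1.
    """
    total = 0.0
    count = 0
    for t in range(len(token_ids) - 1):
        row = logits[t]
        target = token_ids[t + 1]
        # Numerically stable log-sum-exp for the softmax normalizer.
        m = max(row)
        log_z = m + math.log(sum(math.exp(x - m) for x in row))
        total += log_z - row[target]
        count += 1
    return total / count
```

With uniform logits over a vocabulary of size V, this loss equals log(V), which is a quick sanity check for any implementation of the objective.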

Important Note

This model version has been deprecated. Users are encouraged to switch to the improved gervasio-8b-portuguese-ptpt-decoder for better performance and continued support.
