Gervásio 70B PTPT: A Portuguese-Optimized Decoder Model
Gervásio 70B PTPT is a 70-billion-parameter decoder-only language model developed by the NLX Natural Language and Speech Group at the University of Lisbon. It is built on the LLaMA-3.3 70B Instruct architecture and has been extensively fine-tuned for European Portuguese. Training drew on a diverse set of European Portuguese datasets: instruction collections such as extraGLUE-Instruct, NatInst-PTPT, MMLU-PTPT, and Wiki-PTPT, alongside translated and native sources including MMLU, Natural Language Instructions, GLUE tasks (MRPC, RTE, STS-B, WNLI), SuperGLUE tasks (BoolQ, CB, COPA, MultiRC), and a human-curated subset of the Portuguese Wikipedia.
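Below is a minimal loading and inference sketch using the Hugging Face transformers library. The repository ID is an assumption modeled on the naming of earlier Gervásio releases and should be checked against the PORTULAN hub page; a model of this size requires multiple high-memory GPUs (roughly 140 GB of weights in bfloat16).

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

# Hypothetical repository ID, patterned after earlier Gervásio releases;
# verify the actual ID on the PORTULAN organization page.
MODEL_ID = "PORTULAN/gervasio-70b-portuguese-ptpt-decoder"

tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
model = AutoModelForCausalLM.from_pretrained(
    MODEL_ID,
    torch_dtype=torch.bfloat16,  # ~140 GB of weights at 70B parameters
    device_map="auto",           # shard layers across available GPUs
)

prompt = "Qual é a capital de Portugal?"
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=64, do_sample=False)

# Decode only the newly generated tokens, dropping the prompt.
print(tokenizer.decode(
    outputs[0][inputs["input_ids"].shape[-1]:],
    skip_special_tokens=True,
))
```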
Key Capabilities
- Portuguese Language Specialization: Optimized for understanding and generating text in European Portuguese.
- Instruction Following: Enhanced through training on dedicated instruction datasets; see the prompting sketch after this list.
- Reasoning and Question Answering: Demonstrates strong performance on tasks like COPA (reasoning) and MMLU (question answering) in Portuguese.
- Textual Inference: Achieves high F1 scores on RTE (recognizing textual entailment).
- Comparison with LLaMA-3.3 70B Instruct (English): Outperforms its English counterpart on several Portuguese-specific benchmarks, including MRPC, RTE, COPA, MMLU, and Tuguesice-PT.
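Because the model is built on LLaMA-3.3 70B Instruct, it presumably inherits the Llama 3 chat template from the base model. The sketch below shows instruction-style prompting under that assumption, reusing the `tokenizer` and `model` objects loaded earlier.

```python
# Instruction-style prompting via the tokenizer's chat template.
# Assumes the Llama 3 chat template inherited from the base model.
messages = [
    {"role": "system", "content": "És um assistente que responde em português europeu."},
    {"role": "user", "content": "Explica em duas frases o que é a aprendizagem automática."},
]

input_ids = tokenizer.apply_chat_template(
    messages,
    add_generation_prompt=True,  # append the assistant header so the model replies
    return_tensors="pt",
).to(model.device)

outputs = model.generate(input_ids, max_new_tokens=128, do_sample=True, temperature=0.6)
print(tokenizer.decode(outputs[0][input_ids.shape[-1]:], skip_special_tokens=True))
```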
Good for
- Portuguese NLP Applications: Ideal for tasks requiring high-quality text generation and understanding in Portuguese.
- Research and Development: Provides a robust base for further research into large language models for less-resourced languages.
- Chatbots and Conversational AI: Integrated into the Evaristo.ai chatbot for generative capabilities; a minimal conversational loop is sketched below.
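As an illustration of conversational use, here is a hypothetical minimal multi-turn loop built on the transformers text-generation pipeline, which accepts chat-style message lists in recent library versions. Nothing about Evaristo.ai's actual architecture is implied; this only shows one way the model could back a chatbot.

```python
from transformers import pipeline

# Reuses the `model` and `tokenizer` objects loaded in the first sketch.
chat = pipeline("text-generation", model=model, tokenizer=tokenizer)

history = [{"role": "system", "content": "És um assistente prestável."}]

while True:
    user_turn = input("Utilizador: ")
    if not user_turn:
        break
    history.append({"role": "user", "content": user_turn})
    # For chat-style input, the pipeline returns the message list
    # extended with the generated assistant turn.
    result = chat(history, max_new_tokens=256)
    history = result[0]["generated_text"]
    print("Assistente:", history[-1]["content"])
```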