Canarim-7B-Instruct Overview
Canarim-7B-Instruct is a 7-billion-parameter instruction-tuned language model developed by dominguesm. It is built on the Canarim-7B base model and fine-tuned on a diverse set of publicly available instruction datasets. The fine-tuning aims to improve how well the model follows instructions and how relevant its responses are across a range of prompts.
Key Capabilities & Performance
The model is designed for general instruction-following tasks, particularly in Portuguese. On the Open Portuguese LLM Leaderboard it achieved an average score of 47.21. Notable individual results, each well above that average, include:
- Assin2 RTE: 75.74
- HateBR Binary: 79.57
- PT Hate Speech Binary: 64.01
- tweetSentBR: 66
These scores suggest the model is comparatively strong at recognizing textual entailment (Assin2 RTE), detecting hate speech (HateBR, PT Hate Speech), and classifying sentiment (tweetSentBR) in Portuguese. The model supports a context length of 4096 tokens.
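Because the prompt and the generated text share the 4096-token window, it can help to budget tokens before calling the model. The following is a minimal sketch in Python, not part of the model's tooling: the helper names `max_new_tokens` and `truncate_to_context` are hypothetical, token counts are assumed to come from the model's own tokenizer, and dropping the oldest tokens first is an assumed truncation policy.

```python
CONTEXT_LENGTH = 4096  # context window stated for Canarim-7B-Instruct


def max_new_tokens(prompt_tokens: int, context_length: int = CONTEXT_LENGTH) -> int:
    """Return how many tokens can still be generated after the prompt."""
    if prompt_tokens < 0:
        raise ValueError("prompt_tokens must be non-negative")
    # Prompt and completion share one window; never return a negative budget.
    return max(0, context_length - prompt_tokens)


def truncate_to_context(token_ids, reserved_for_output: int,
                        context_length: int = CONTEXT_LENGTH) -> list:
    """Keep the most recent prompt tokens so the reserved output still fits."""
    budget = context_length - reserved_for_output
    if budget <= 0:
        raise ValueError("reserved_for_output must be smaller than context_length")
    # Slicing from the end drops the oldest tokens first (an assumed policy).
    return list(token_ids[-budget:])
```

For example, a 4000-token prompt leaves room for 96 generated tokens, and reserving 256 tokens for the output caps the prompt at 3840 tokens.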
When to Use This Model
Canarim-7B-Instruct is suited to applications that need instruction following in Portuguese. Because it was fine-tuned on instruction datasets, it fits tasks such as summarization, question answering, and text generation, provided the prompt states the task clearly. Its results on the Portuguese benchmarks above give developers a reference point when evaluating it for natural language understanding and generation in that language.
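As an illustration of supplying a clear instruction, the sketch below wraps a Portuguese instruction in a simple prompt and passes it to the Hugging Face `transformers` text-generation pipeline. Several parts are assumptions rather than facts from this description: the Hub identifier in `MODEL_ID`, the helper names `build_prompt` and `generate`, and the prompt template itself, which is hypothetical and may differ from the format the model was trained on. Check the model's own card for the recommended template.

```python
MODEL_ID = "dominguesm/canarim-7b-instruct"  # assumed Hub id; verify before use


def build_prompt(instruction: str, context: str = "") -> str:
    """Build a simple instruction prompt (hypothetical template, not the official one)."""
    instruction = instruction.strip()
    if not instruction:
        raise ValueError("instruction must not be empty")
    parts = [f"Instrução:\n{instruction}"]
    if context.strip():
        # Optional supporting text, e.g. the passage to summarize.
        parts.append(f"Contexto:\n{context.strip()}")
    parts.append("Resposta:\n")
    return "\n\n".join(parts)


def generate(instruction: str, context: str = "", max_new_tokens: int = 256) -> str:
    """Run the prompt through a text-generation pipeline and return the completion."""
    # Imported lazily so build_prompt can be used without transformers installed.
    from transformers import pipeline

    generator = pipeline("text-generation", model=MODEL_ID)
    outputs = generator(
        build_prompt(instruction, context),
        max_new_tokens=max_new_tokens,
        return_full_text=False,  # return only the newly generated text
    )
    return outputs[0]["generated_text"]
```

Calling `generate("Resuma o texto a seguir em uma frase.", context=texto)` with a hypothetical `texto` string would download the model weights on first use, so a machine with enough memory for a 7-billion-parameter model is assumed.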