dominguesm/Canarim-7B-Instruct
dominguesm/Canarim-7B-Instruct is a 7-billion-parameter instruction-tuned language model developed by dominguesm and initialized from the Canarim-7B base model. It was fine-tuned on publicly available instruction datasets, which makes it suitable for general instruction-following tasks. It has a 4096-token context length and targets Portuguese language understanding and generation, as shown by its results on the Open Portuguese LLM Leaderboard.
Canarim-7B-Instruct Overview
Canarim-7B-Instruct is a 7-billion-parameter instruction-tuned language model developed by dominguesm. It builds on the Canarim-7B base model and was fine-tuned on a diverse set of publicly available instruction datasets. The goal of this fine-tuning is to make the model follow the instruction in a prompt and produce a relevant response, rather than simply continue the text.
Key Capabilities & Performance
This model is designed for general instruction-following tasks, particularly in Portuguese. Its performance has been evaluated on the Open Portuguese LLM Leaderboard, where it achieved an average score of 47.21. Notable results include:
- Assin2 RTE: 75.74
- HateBR Binary: 79.57
- PT Hate Speech Binary: 64.01
- tweetSentBR: 66
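These scores indicate that the model handles Portuguese textual entailment (Assin2 RTE), hate speech detection (HateBR, PT Hate Speech), and sentiment classification (tweetSentBR) reasonably well. All four are above the overall average of 47.21, which implies that the remaining leaderboard tasks score lower. The model supports a context length of 4096 tokens.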
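The 4096-token limit can be turned into a simple budgeting rule. The sketch below assumes, as is usual for causal language models, that the prompt and the generated tokens share the same context window. The helper names `remaining_budget` and `truncate_to_fit` are hypothetical and not part of any library.

```python
CONTEXT_LENGTH = 4096  # context length stated for Canarim-7B-Instruct


def remaining_budget(prompt_tokens: int, context_length: int = CONTEXT_LENGTH) -> int:
    """Return how many tokens can still be generated after the prompt."""
    if prompt_tokens < 0:
        raise ValueError("prompt_tokens must be non-negative")
    # A prompt longer than the window leaves no room for generation.
    return max(context_length - prompt_tokens, 0)


def truncate_to_fit(token_ids, max_new_tokens: int, context_length: int = CONTEXT_LENGTH):
    """Keep the most recent prompt tokens so prompt + max_new_tokens fits the window."""
    if not 0 <= max_new_tokens < context_length:
        raise ValueError("max_new_tokens must be in [0, context_length)")
    keep = context_length - max_new_tokens
    # Slicing from the end drops the oldest tokens first; short prompts pass through unchanged.
    return token_ids[-keep:]
```

Dropping the oldest tokens is only one possible policy. For instruction prompts it may be better to shorten the supporting context and keep the instruction itself intact.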
When to Use This Model
Canarim-7B-Instruct is a candidate for applications that need instruction following in Portuguese. Its instruction fine-tuning makes it suitable for summarization, question answering, and text generation when the prompt states the task clearly. Its benchmark results point to classification-style understanding tasks in Portuguese, such as textual entailment, hate speech detection, and sentiment analysis, as the best-supported use cases.
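A minimal sketch of how the model might be called for these tasks follows. It assumes the model is published on the Hugging Face Hub under the ID in the title and loads with the standard `transformers` causal-LM classes. The prompt layout in `build_prompt` is a hypothetical placeholder, not the model's documented template, so check the official model card for the format it was trained on.

```python
MODEL_ID = "dominguesm/Canarim-7B-Instruct"  # assumed Hugging Face Hub ID


def build_prompt(instruction: str, context: str = "") -> str:
    """Assemble a plain instruction prompt (hypothetical layout, not the official template)."""
    parts = [f"Instrução: {instruction.strip()}"]
    if context.strip():
        parts.append(f"Contexto: {context.strip()}")
    parts.append("Resposta:")
    return "\n\n".join(parts)


def generate(instruction: str, context: str = "", max_new_tokens: int = 256) -> str:
    """Load the model and return only the newly generated text."""
    # Imported here so the prompt helper can be used without transformers installed.
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
    model = AutoModelForCausalLM.from_pretrained(MODEL_ID)

    inputs = tokenizer(build_prompt(instruction, context), return_tensors="pt")
    output_ids = model.generate(**inputs, max_new_tokens=max_new_tokens)

    # The output contains the prompt followed by the completion; strip the prompt part.
    prompt_length = inputs["input_ids"].shape[1]
    return tokenizer.decode(output_ids[0][prompt_length:], skip_special_tokens=True)
```

Calling `generate("Resuma o texto a seguir em uma frase.", context=texto)` would download the weights on first use. A 7B model needs roughly 14 GB of memory in 16-bit precision, so a GPU or a quantized variant is likely necessary in practice.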