dominguesm/canarim-7b

Text Generation · Concurrency Cost: 1 · Model Size: 7B · Quant: FP8 · Ctx Length: 4k · Published: Nov 16, 2023 · License: llama2 · Architecture: Transformer · Open Weights

Canarim-7B by Maicon Domingues is a 7 billion parameter Portuguese large language model, pretrained on 16 billion tokens from the Portuguese subset of CommonCrawl 2023-23, starting from LLaMA2-7B weights. Specialized in understanding and generating Portuguese text, it is optimized for few-shot tasks in Natural Language Understanding and Generation, making it ideal for applications targeting Portuguese-speaking audiences.

Canarim-7B: A Portuguese-Specialized LLM

Canarim-7B is a 7 billion parameter large language model developed by Maicon Domingues, designed specifically for Portuguese. It was pretrained on 16 billion tokens from the Portuguese subset of the CommonCrawl 2023-23 crawl, starting from LLaMA2-7B weights rather than from random initialization, so it inherits the LLaMA2 architecture while adapting to Portuguese text.
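For reference, a minimal loading sketch follows, assuming the weights are published as a standard LLaMA-2-style causal language model under the Hugging Face repo id from the card title; the dtype and device settings are illustrative, not prescribed by the card.

```python
# Minimal loading sketch for a LLaMA-2-style causal LM (repo id assumed
# from the card title); dtype/device choices are illustrative.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "dominguesm/canarim-7b"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.float16,  # half precision keeps a 7B model near 14 GB
    device_map="auto",          # place layers on available GPU(s)
)
```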

Key Capabilities

  • Portuguese Language Specialization: Optimized for understanding and generating text in Portuguese, making it highly suitable for applications targeting Portuguese-speaking users.
  • Robust Architecture: Inherits the efficient and reliable architecture of LLaMA2-7B.
  • Diverse Pretraining Data: Trained on a wide range of Portuguese text, enhancing its ability to handle various contexts and nuances.
  • Few-shot Learning: Best suited for tasks where a few examples of the desired output can be provided in the prompt, rather than zero-shot use (see the sketch after this list).
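Because Canarim-7B is a base (non-instruct) model, few-shot use means writing the examples as a plain completion and letting the model continue the pattern. The sketch below does this for Portuguese sentiment labeling; the sentences and labels are invented for illustration, and it reuses the model and tokenizer from the loading sketch above.

```python
# Few-shot completion sketch: demonstrations are concatenated into one prompt
# and the model continues the pattern. Example sentences are illustrative.
few_shot_prompt = (
    "Frase: Adorei o atendimento, voltarei com certeza!\n"
    "Sentimento: positivo\n\n"
    "Frase: O produto chegou quebrado e ninguém respondeu.\n"
    "Sentimento: negativo\n\n"
    "Frase: A entrega foi rápida e o preço justo.\n"
    "Sentimento:"
)

inputs = tokenizer(few_shot_prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=3, do_sample=False)
completion = tokenizer.decode(
    outputs[0][inputs["input_ids"].shape[1]:], skip_special_tokens=True
)
print(completion.strip())  # the model should continue with a label, e.g. "positivo"
```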

Good For

  • Natural Language Understanding (NLU): Effective for tasks like sentiment analysis, topic classification, and entity recognition in Portuguese, especially with relevant examples.
  • Natural Language Generation (NLG): Capable of generating coherent and contextually appropriate Portuguese text for content creation or chatbots, with improved results when given style or format examples.
  • Language Translation: Suitable for translation to and from Portuguese, particularly when example translations are provided in the prompt or during fine-tuning (see the sketch after this list).
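As a concrete instance of the translation bullet, here is a hypothetical few-shot prompt run through the transformers text-generation pipeline; the sentence pairs are invented, and the model and tokenizer come from the loading sketch above.

```python
# Illustrative EN -> PT few-shot translation; sentence pairs are invented,
# model/tokenizer are reused from the loading sketch.
from transformers import pipeline

generator = pipeline("text-generation", model=model, tokenizer=tokenizer)

prompt = (
    "English: The weather is beautiful today.\n"
    "Português: O tempo está lindo hoje.\n\n"
    "English: Where is the nearest train station?\n"
    "Português:"
)

result = generator(prompt, max_new_tokens=20, do_sample=False)
print(result[0]["generated_text"][len(prompt):].strip())
```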

Performance Highlights

Evaluations on the Open PT LLM Leaderboard show an average score of 47.36, with notable results in tasks like HateBR (78.48) and ASSIN2 RTE (71.96). On the Open LLM Leaderboard, it achieved an average of 48.63, including 77.52 on HellaSwag (10-shot) and 71.43 on Winogrande (5-shot).