botbotrobotics/CabraLlama3-8b

TEXT GENERATIONConcurrency Cost:1Model Size:8BQuant:FP8Ctx Length:8kTool Calling:SupportedPublished:Apr 21, 2024License:cc-by-nc-2.0Architecture:Transformer0.0K Open Weights Cold

CabraLlama3-8b by botbotrobotics is an 8 billion parameter instruction-tuned Llama 3 model, specifically optimized for understanding and responding in Portuguese. It was refined using the internal Cabra 30k dataset, enhancing its performance on Portuguese language tasks. This model is designed for research purposes, focusing on generative model research and investigating model limitations and biases within a Portuguese context.

Loading preview...

What is CabraLlama3-8b?

CabraLlama3-8b is an 8 billion parameter instruction-tuned language model developed by botbotrobotics. It is a fine-tuned version of Meta's Llama-3-8B-Instruct, specifically enhanced for the Portuguese language using the proprietary Cabra 30k dataset. The base Llama 3 architecture is a self-regressive language model utilizing an optimized transformer architecture, with instruction-tuned versions employing supervised fine-tuning (SFT) and reinforcement learning with human feedback (RLHF) for improved utility and safety.

Key Capabilities & Optimizations

  • Portuguese Language Proficiency: Significantly optimized for understanding and generating responses in Portuguese, making it highly suitable for applications requiring strong performance in this language.
  • Instruction Following: Benefits from the Llama 3 base model's instruction-tuning, making it effective for dialogue-based use cases.
  • Research Focus: Primarily intended for research into generative models, including exploring their limitations and biases.

Performance Highlights

Evaluations on the Open Portuguese LLM Leaderboard show an average score of 69.42. Specific task performance includes:

  • ENEM Challenge (No Images): 74.67
  • Assin2 RTE: 90.44
  • Assin2 STS: 69.85
  • FaQuAD NLI: 70.38

Usage & Limitations

This model is currently restricted to research purposes only and is not permitted for commercial use. Quantized GGUF versions are available in the "quantanization" branch for various deployment needs.