rhaymison/phi-3-portuguese-tom-cat-4k-instruct

Public · 4B parameters · BF16 · 4096-token context · Apr 29, 2024 · License: apache-2.0
Overview

rhaymison/phi-3-portuguese-tom-cat-4k-instruct is a 4-billion-parameter instruction-tuned model derived from the microsoft/Phi-3-mini-4k architecture. Its primary goal is to improve the availability and performance of large language models in Portuguese: the model was fine-tuned on a dataset of 300,000 Portuguese instructions, addressing a significant gap in the LLM landscape for the language.

Key Capabilities

  • Portuguese Language Proficiency: Optimized for understanding and generating text in Portuguese, making it suitable for applications requiring strong linguistic capabilities in this language.
  • Instruction Following: Fine-tuned with a large instruction set, enabling it to follow complex commands and generate relevant responses.
  • Context Length: Supports a 4096-token context window, allowing for processing and generating longer texts while maintaining coherence.
  • Quantization Support: Compatible with 4-bit and 8-bit quantization, facilitating deployment on less powerful hardware like T4 or L4 GPUs, while the full model requires an A100.
  • GGUF Compatibility: Available in GGUF formats (e.g., rhaymison/phi-3-portuguese-tom-cat-4k-instruct-q8-gguf), enabling execution with LlamaCpp for enhanced compatibility and local deployment.

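The capabilities above can be sketched with a standard transformers loading pattern. This is a minimal example, not code from the model card: the 4-bit bitsandbytes configuration reflects the card's note about T4/L4-class deployment, and the Portuguese prompt and generation settings are illustrative assumptions.

```python
# Minimal sketch: load the model with 4-bit quantization (bitsandbytes)
# so the 4B model fits on a T4/L4-class GPU. Prompt text is illustrative.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig

model_id = "rhaymison/phi-3-portuguese-tom-cat-4k-instruct"

# NF4 4-bit weights with bfloat16 compute, matching the card's BF16 precision
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.bfloat16,
)

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    quantization_config=bnb_config,
    device_map="auto",
)

# The tokenizer's chat template applies the Phi-3 instruction markers
messages = [{"role": "user", "content": "Explique o que é aprendizado de máquina."}]
inputs = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

outputs = model.generate(inputs, max_new_tokens=256, do_sample=False)
print(tokenizer.decode(outputs[0][inputs.shape[-1]:], skip_special_tokens=True))
```

Dropping the `quantization_config` argument loads the full BF16 weights, which per the card requires an A100-class GPU.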
Performance

Evaluated on the Open Portuguese LLM Leaderboard, the model achieved an average score of 64.57. Notable results include:

  • Assin2 RTE: 91.54
  • Assin2 STS: 75.27
  • PT Hate Speech Binary: 70.19

Good For

  • Applications requiring a robust instruction-following model specifically for the Portuguese language.
  • Developers who need efficient deployment across hardware configurations, including memory-constrained GPUs, via 4-bit or 8-bit quantization.
  • Use cases where a 4096-token context window is sufficient for processing user queries and generating responses.