lordjia/Llama-3-Cantonese-8B-Instruct

Warm
Public
8B
FP8
8192
License: llama3
Hugging Face
Overview

Model Overview

Llama-3-Cantonese-8B-Instruct is an 8 billion parameter language model developed by lordjia, built upon the Meta-Llama-3-8B-Instruct architecture. It has been fine-tuned using the LoRA method over 4562 steps, specifically to improve its capabilities in the Cantonese language.

Key Capabilities

  • Enhanced Cantonese Processing: Designed to boost Cantonese text generation and understanding.
  • Versatile Task Support: Capable of handling various natural language tasks, including dialogue generation, text summarization, and question-answering in Cantonese.
  • Specialized Training Data: Fine-tuned on dedicated Cantonese datasets, including jed351/cantonese-wikipedia and lordjia/Cantonese_English_Translation, to ensure high linguistic relevance.
  • Quantized Version Available: A 4-bit quantized version (llama3-cantonese-8b-instruct-q4_0.gguf) is provided for more efficient inference and deployment.

Performance Insights

Evaluations on the Open LLM Leaderboard show an average score of 24.16. Specific metrics include 66.69 for IFEval (0-Shot), 26.79 for BBH (3-Shot), and 27.94 for MMLU-PRO (5-shot).

Good For

  • Developers and researchers focusing on Cantonese language applications.
  • Projects requiring robust Cantonese dialogue, summarization, or Q&A functionalities.
  • Use cases where a specialized, instruction-tuned Cantonese model is preferred over general-purpose LLMs.