Model Overview
MLP-KTLim/llama-3.1-Asian-Bllossom-8B-Translator is an 8 billion parameter multilingual translation model built upon the LLaMA 3.1 Instruct base. Developed by MLP-KTLim, this model is specifically fine-tuned for mutual translation among five key Southeast Asian languages: Korean, Vietnamese, Indonesian, Cambodian (Khmer), and Thai. It is designed to facilitate communication within these language communities.
Key Capabilities
- Multilingual Translation: Supports bidirectional translation across all pairs of Korean, Vietnamese, Indonesian, Cambodian, and Thai.
- Optimized for Short Texts: Best suited for translating short sentences, phrases, and basic conversational snippets.
- Extensive Training: Trained on a substantial dataset of 20 million examples, with 1 million examples for each translation direction, ensuring robust performance for common expressions.
Performance Highlights
The model demonstrates strong performance across various language pairs, with BLEU scores ranging from 45.59 to 78.84. For instance, Indonesian to Cambodian translation achieves a BLEU score of 78.84, while Korean to Cambodian reaches 71.69. ROUGE-1 and ROUGE-L scores also indicate high accuracy in content and linguistic quality.
Limitations
- Short Text Focus: Performance may degrade with complex or lengthy texts.
- Context Sensitivity: Translation quality can vary based on the specific language pair and the complexity of the content.
Good for
- Applications requiring translation between Korean, Vietnamese, Indonesian, Cambodian, and Thai.
- Translating short messages, user inputs, or conversational elements in these specific languages.