Llama-3-MAAL-8B-Instruct-v0.1: Multilingual Adaptive Augmentation Language Model
Developed by maum.ai Brain NLP, Llama-3-MAAL-8B-Instruct-v0.1 is an 8-billion-parameter instruction-tuned model based on Llama-3, with an 8,192-token context length. This model introduces a Multilingual Adaptive Augmentation Language model (MAAL) approach, focusing on transferring instruction-following capabilities from English to Korean through cross-lingual training.
Key Capabilities & Features
- Bilingual Instruction Following: Specifically trained to understand and respond to instructions in both Korean and English.
- Cross-lingual Transfer: Utilizes cross-lingual training to efficiently transfer instruction-following skills from English to Korean without extensive continuous pre-training.
- Optimized for Korean Logic: Evaluated using the LogicKor benchmark, demonstrating competitive performance against other Korean-fine-tuned models in single-turn and multi-turn Korean instruction tasks.
- Efficient Training: Trained on 8 H100-80G GPUs for one day, highlighting an efficient development process for its bilingual capabilities.
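As a Llama-3-based instruct model, MAAL should accept the standard Llama-3 chat format. The sketch below is a hypothetical usage example, not taken from the model card itself: the Hugging Face repo id `maum-ai/Llama-3-MAAL-8B-Instruct-v0.1` is an assumption inferred from the model name, and the prompt layout is the generic Llama-3 instruct template.

```python
def build_llama3_prompt(messages):
    """Render chat messages in the Llama-3 instruct template,
    the format Llama-3-based instruct models are trained on."""
    prompt = "<|begin_of_text|>"
    for m in messages:
        prompt += (
            f"<|start_header_id|>{m['role']}<|end_header_id|>\n\n"
            f"{m['content']}<|eot_id|>"
        )
    # Trailing assistant header cues the model to generate its reply.
    prompt += "<|start_header_id|>assistant<|end_header_id|>\n\n"
    return prompt


def run_example():
    """Load the model and answer a Korean question.

    Requires `pip install transformers torch` and enough GPU memory
    for the 8B weights; defined here but not invoked by default.
    The repo id is an assumption -- verify it on the Hub before use.
    """
    from transformers import AutoModelForCausalLM, AutoTokenizer

    model_id = "maum-ai/Llama-3-MAAL-8B-Instruct-v0.1"  # assumed repo id
    tokenizer = AutoTokenizer.from_pretrained(model_id)
    model = AutoModelForCausalLM.from_pretrained(model_id, device_map="auto")

    messages = [{"role": "user", "content": "한국의 수도는 어디인가요?"}]
    # tokenizer.apply_chat_template yields the same layout as build_llama3_prompt
    inputs = tokenizer.apply_chat_template(
        messages, add_generation_prompt=True, return_tensors="pt"
    ).to(model.device)
    out = model.generate(inputs, max_new_tokens=128)
    print(tokenizer.decode(out[0][inputs.shape[-1]:], skip_special_tokens=True))
```

Since the model is trained for both Korean and English instructions, either language can be used in the `user` turn.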
Limitations & Future Development
The model currently has two main limitations: it struggles to generate diverse Korean text, and it lacks deep Korean knowledge and cultural localization, both stemming from its relatively small training dataset. Future plans include enhancing Korean generation through vocabulary expansion and continuous pre-training, localizing the model with cultural adaptation methods, and developing a Vision Language Model variant.