Llama-3-MAAL-8B-Instruct-v0.1: Multilingual Adaptive Augmentation Language Model
Developed by maum.ai Brain NLP, Llama-3-MAAL-8B-Instruct-v0.1 is an 8-billion-parameter instruction-tuned model based on Llama-3, with an 8,192-token context length. This model introduces a Multilingual Adaptive Augmentation Language model (MAAL) approach, focusing on transferring instruction-following capabilities from English to Korean through cross-lingual training.
Key Capabilities & Features
- Bilingual Instruction Following: Specifically trained to understand and respond to instructions in both Korean and English.
- Cross-lingual Transfer: Utilizes cross-lingual training to efficiently transfer instruction-following skills from English to Korean without extensive continuous pre-training.
- Optimized for Korean Logic: Evaluated using the LogicKor benchmark, demonstrating competitive performance against other Korean-fine-tuned models in single-turn and multi-turn Korean instruction tasks.
- Efficient Training: Trained on 8 H100-80G GPUs for one day, highlighting an efficient development process for its bilingual capabilities.
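As a Llama-3-based instruct model, MAAL should accept the standard Llama-3 chat format. The sketch below is a hypothetical usage example, not taken from the model card itself: the Hugging Face repo id `maum-ai/Llama-3-MAAL-8B-Instruct-v0.1` is an assumption inferred from the model name, and the prompt layout is the generic Llama-3 instruct template.

```python
def build_llama3_prompt(messages):
    """Render chat messages in the Llama-3 instruct template,
    the format Llama-3-based instruct models are trained on."""
    prompt = "<|begin_of_text|>"
    for m in messages:
        prompt += (
            f"<|start_header_id|>{m['role']}<|end_header_id|>\n\n"
            f"{m['content']}<|eot_id|>"
        )
    # Trailing assistant header cues the model to generate its reply.
    prompt += "<|start_header_id|>assistant<|end_header_id|>\n\n"
    return prompt


def run_example():
    """Load the model and answer a Korean question.

    Requires `pip install transformers torch` and enough GPU memory
    for the 8B weights; defined here but not invoked by default.
    The repo id is an assumption -- verify it on the Hub before use.
    """
    from transformers import AutoModelForCausalLM, AutoTokenizer

    model_id = "maum-ai/Llama-3-MAAL-8B-Instruct-v0.1"  # assumed repo id
    tokenizer = AutoTokenizer.from_pretrained(model_id)
    model = AutoModelForCausalLM.from_pretrained(model_id, device_map="auto")

    messages = [{"role": "user", "content": "한국의 수도는 어디인가요?"}]
    # tokenizer.apply_chat_template yields the same layout as build_llama3_prompt
    inputs = tokenizer.apply_chat_template(
        messages, add_generation_prompt=True, return_tensors="pt"
    ).to(model.device)
    out = model.generate(inputs, max_new_tokens=128)
    print(tokenizer.decode(out[0][inputs.shape[-1]:], skip_special_tokens=True))
```

Since the model is trained for both Korean and English instructions, either language can be used in the `user` turn.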
Limitations & Future Development
The model currently has two main limitations: it struggles to generate diverse Korean text, and it lacks deep Korean knowledge and cultural localization, both stemming from its relatively small training dataset. Future plans include enhancing Korean generation through vocabulary expansion and continuous pre-training, localizing the model with cultural adaptation methods, and developing a Vision Language Model variant.