Overview
Llama3-ViettelSolutions-8B is an 8 billion parameter autoregressive transformer model, a variant of the Meta Llama-3-8B. Developed by Viettel Solutions and funded by NVIDIA, this model has been specifically adapted for the Vietnamese language.
Key Capabilities
- Multilingual Support: Processes both Vietnamese and English, with a strong focus on Vietnamese language tasks.
- Instruction Following: Supervised fine-tuned on 5 million samples of Vietnamese instruct data, making it proficient in understanding and executing instructions.
- Vietnamese Language Expertise: Continued pre-training on a dedicated Vietnamese curated dataset enhances its performance and understanding of the language.
Training Details
The model underwent a two-stage training process:
- Continued Pre-training: Utilized the Vietnamese curated dataset.
- Supervised Fine-tuning: Applied to 5 million samples from the Instruct general dataset.
Training was conducted using bf16 mixed precision with a data sequence length of 8192, on NVIDIA DGX infrastructure featuring 4 x A100 80GB GPUs, leveraging the NeMo Framework.
Good For
- Applications requiring strong Vietnamese language understanding and generation.
- Instruction-following tasks in Vietnamese.
- Developers looking for a Llama 3-based model optimized for the Vietnamese linguistic context.