Overview

Llama3-ViettelSolutions-8B is an 8-billion-parameter autoregressive transformer model and a variant of Meta Llama-3-8B. Developed by Viettel Solutions and funded by NVIDIA, it has been adapted specifically for the Vietnamese language.

Key Capabilities

Bilingual Support: Processes both Vietnamese and English, with a strong focus on Vietnamese language tasks.
Instruction Following: Fine-tuned with supervision on 5 million samples of Vietnamese instruction data so that it can understand and execute instructions.
Vietnamese Language Expertise: Continued pre-training on a dedicated Vietnamese curated dataset enhances its performance and understanding of the language.

Training Details

The model underwent a two-stage training process:

Continued Pre-training: Utilized the Vietnamese curated dataset.
Supervised Fine-tuning: Applied to 5 million samples from the Instruct general dataset.

Training used bf16 mixed precision with a sequence length of 8192 tokens on NVIDIA DGX infrastructure with 4 x A100 80GB GPUs, using the NeMo Framework.
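
To put the hardware figures above in context, the following is a rough back-of-the-envelope sketch of memory use. It assumes the nominal 8 billion parameter count (the exact count may differ) and a common rule of thumb of 16 bytes per parameter for mixed-precision training with an Adam-style optimizer. The optimizer, sharding strategy, and actual memory footprint are not stated in this card, so the function names and the 16-byte figure are illustrative assumptions only, and activation memory is ignored.

```python
# Rough memory arithmetic; every figure here is an estimate, not a measurement.
NOMINAL_PARAMS = 8_000_000_000  # "8 billion" from the model card; exact count may differ
GPU_MEMORY_GB = 80              # one A100 80GB
NUM_GPUS = 4                    # as stated in the training details


def weights_gb(params: int, bytes_per_param: int = 2) -> float:
    """Size of the weights alone; bf16 stores each parameter in 2 bytes."""
    return params * bytes_per_param / 1e9


def mixed_precision_adam_gb(params: int) -> float:
    """Assumed training state per parameter (rule of thumb, not from the card):
    2 (bf16 weights) + 2 (bf16 gradients) + 4 (fp32 master weights)
    + 4 + 4 (two fp32 Adam moment estimates) = 16 bytes.
    """
    return params * 16 / 1e9


if __name__ == "__main__":
    print(f"bf16 weights:        {weights_gb(NOMINAL_PARAMS):.0f} GB")
    print(f"training state est.: {mixed_precision_adam_gb(NOMINAL_PARAMS):.0f} GB")
    print(f"aggregate GPU memory: {GPU_MEMORY_GB * NUM_GPUS} GB")
```

Under these assumptions the bf16 weights come to about 16 GB, while the estimated training state of about 128 GB exceeds a single 80 GB GPU but fits within the 320 GB available across four, which is consistent with a multi-GPU setup.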

Good For

Applications requiring strong Vietnamese language understanding and generation.
Instruction-following tasks in Vietnamese.
Developers looking for a Llama 3-based model optimized for the Vietnamese linguistic context.
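
For the use cases above, a minimal inference sketch might look like the following. It assumes the checkpoint is published in a format compatible with the Hugging Face Transformers library and that the torch, transformers, and accelerate packages are installed; none of this is confirmed by the card. The generate helper is a hypothetical name, the repository id is deliberately left as a parameter because the card does not state it, and the prompt is passed as plain text because the expected prompt or chat format is also not stated.

```python
def generate(model_id: str, prompt: str, max_new_tokens: int = 128) -> str:
    """Load a causal language model and return its continuation of `prompt`.

    `model_id` is the hub repository id or a local path; it is not given in
    the model card, so the caller must supply it.
    """
    # Imported lazily so that defining this helper does not require the
    # heavy dependencies to be installed.
    import torch
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(model_id)
    model = AutoModelForCausalLM.from_pretrained(
        model_id,
        torch_dtype=torch.bfloat16,  # matches the bf16 precision used in training
        device_map="auto",           # requires the accelerate package
    )

    inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
    with torch.no_grad():
        output_ids = model.generate(**inputs, max_new_tokens=max_new_tokens)

    # Drop the prompt tokens so only the newly generated text is returned.
    new_tokens = output_ids[0][inputs["input_ids"].shape[1]:]
    return tokenizer.decode(new_tokens, skip_special_tokens=True)
```

A call would then look like generate(model_id, "Hãy giới thiệu ngắn gọn về Hà Nội."), with model_id taken from the model's hub page. If the model expects a specific instruction template, the prompt would need to be wrapped accordingly.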
