VTSNLP/Llama3-ViettelSolutions-8B

Hugging Face
TEXT GENERATIONConcurrency Cost:1Model Size:8BQuant:FP8Ctx Length:8kPublished:Sep 26, 2024License:llama3Architecture:Transformer0.0K Warm

Llama3-ViettelSolutions-8B is an 8 billion parameter autoregressive transformer model developed by Viettel Solutions, based on Meta's Llama-3-8B architecture. It was continued pre-trained on a Vietnamese curated dataset and supervised fine-tuned on 5 million samples of Vietnamese instruct data. This model is designed for natural language processing tasks in both Vietnamese and English, excelling in instruction-following for Vietnamese language applications.

Loading preview...

Overview

Llama3-ViettelSolutions-8B is an 8 billion parameter autoregressive transformer model, a variant of the Meta Llama-3-8B. Developed by Viettel Solutions and funded by NVIDIA, this model has been specifically adapted for the Vietnamese language.

Key Capabilities

  • Multilingual Support: Processes both Vietnamese and English, with a strong focus on Vietnamese language tasks.
  • Instruction Following: Supervised fine-tuned on 5 million samples of Vietnamese instruct data, making it proficient in understanding and executing instructions.
  • Vietnamese Language Expertise: Continued pre-training on a dedicated Vietnamese curated dataset enhances its performance and understanding of the language.

Training Details

The model underwent a two-stage training process:

  1. Continued Pre-training: Utilized the Vietnamese curated dataset.
  2. Supervised Fine-tuning: Applied to 5 million samples from the Instruct general dataset.

Training was conducted using bf16 mixed precision with a data sequence length of 8192, on NVIDIA DGX infrastructure featuring 4 x A100 80GB GPUs, leveraging the NeMo Framework.

Good For

  • Applications requiring strong Vietnamese language understanding and generation.
  • Instruction-following tasks in Vietnamese.
  • Developers looking for a Llama 3-based model optimized for the Vietnamese linguistic context.