Overview

Llama-SEA-LION-v3-8B is an 8-billion-parameter Large Language Model (LLM) developed by AI Singapore for the Southeast Asian (SEA) region. It is built on Llama 3.1 8B Instruct and has undergone continued pre-training on approximately 200 billion tokens across 11 languages used in the region: Burmese, Chinese, English, Filipino, Indonesian, Khmer, Lao, Malay, Tamil, Thai, and Vietnamese. The name SEA-LION stands for "Southeast Asian Languages In One Network."

Key Capabilities

Multilingual Proficiency: Covers the 11 languages listed in the overview, making it suitable for regional applications and cross-lingual tasks.
Continued Pre-training: Continued pre-training on approximately 200B tokens, including data from the SEA-LION Pile, targets the linguistic nuances of SEA languages.
Benchmark Performance: Evaluated using the SEA-HELM evaluation benchmark for general language capabilities (QA, Sentiment, Toxicity, Translation, Summarisation, Reasoning, NLI) and SEA-IFEval for constraint-following behavior in both English and SEA languages.
Llama 3.1 Foundation: Uses the Llama 3.1 architecture and its default tokenizer.
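
Because the model keeps the Llama 3.1 architecture and tokenizer, it can presumably be loaded like any other Llama 3.1 checkpoint. The following is a minimal sketch, not an official recipe: it assumes the weights are published on the Hugging Face Hub under the repository id `aisingapore/Llama-SEA-LION-v3-8B` and that `transformers`, `torch`, and `accelerate` are installed. The helper names `is_supported` and `load_and_generate` are hypothetical and exist only for illustration.

```python
# Assumption: this Hugging Face repository id is illustrative; check the
# full model card for the authoritative id before relying on it.
MODEL_ID = "aisingapore/Llama-SEA-LION-v3-8B"

# The 11 languages named in the overview.
SUPPORTED_LANGUAGES = [
    "Burmese", "Chinese", "English", "Filipino", "Indonesian", "Khmer",
    "Lao", "Malay", "Tamil", "Thai", "Vietnamese",
]


def is_supported(language: str) -> bool:
    """Return True if `language` is one of the 11 listed languages."""
    wanted = language.strip().lower()
    return wanted in {name.lower() for name in SUPPORTED_LANGUAGES}


def load_and_generate(prompt: str, model_id: str = MODEL_ID,
                      max_new_tokens: int = 64) -> str:
    """Load the model with its default tokenizer and continue `prompt`.

    Heavy imports are done lazily so this module can be imported on a
    machine without the model dependencies installed.
    """
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(model_id)
    model = AutoModelForCausalLM.from_pretrained(
        model_id,
        torch_dtype="auto",   # use the dtype stored in the checkpoint
        device_map="auto",    # requires the `accelerate` package
    )
    inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
    output_ids = model.generate(**inputs, max_new_tokens=max_new_tokens)
    return tokenizer.decode(output_ids[0], skip_special_tokens=True)
```

Calling `load_and_generate("Ibu kota Indonesia adalah")` would download the weights on first use, so an 8B model needs a correspondingly sized GPU or a lot of patience on CPU.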

Good for

Applications requiring strong performance in multiple Southeast Asian languages.
Research and development focusing on multilingual NLP tasks in the SEA region.
Use cases demanding adherence to specific output constraints, as evaluated by SEA-IFEval.
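
To make "output constraints" concrete, the sketch below checks a generated string against a few mechanically verifiable rules. It illustrates the general kind of check that constraint-following evaluations rely on; it is not SEA-IFEval's actual implementation, and the function name `check_constraints` and its parameters are hypothetical. Length is measured in characters rather than words because several of the listed languages (Thai, Lao, Khmer, Burmese) do not separate words with spaces.

```python
import json
from typing import Iterable, List, Optional


def check_constraints(text: str,
                      max_chars: Optional[int] = None,
                      required_phrases: Iterable[str] = (),
                      must_be_json: bool = False) -> List[str]:
    """Return a list of human-readable violations (empty if all rules pass)."""
    violations = []

    # Rule 1: optional upper bound on output length, in characters.
    if max_chars is not None and len(text) > max_chars:
        violations.append(f"length {len(text)} exceeds {max_chars} characters")

    # Rule 2: every required phrase must appear verbatim in the output.
    for phrase in required_phrases:
        if phrase not in text:
            violations.append(f"missing required phrase: {phrase!r}")

    # Rule 3: optionally require the whole output to parse as JSON.
    if must_be_json:
        try:
            json.loads(text)
        except json.JSONDecodeError:
            violations.append("output is not valid JSON")

    return violations
```

An empty return value means the output satisfied every rule that was requested; anything else can be logged or used to trigger a retry.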

Limitations

The model has not been aligned for safety, and developers are advised to perform their own safety fine-tuning and implement related security measures.
