aisingapore/Llama-SEA-LION-v3-8B
Llama-SEA-LION-v3-8B is an 8-billion-parameter multilingual decoder model developed by AI Singapore and built on the Llama 3.1 architecture. It has undergone continued pre-training on approximately 200 billion tokens across 11 languages used in Southeast Asia: Burmese, Chinese, English, Filipino, Indonesian, Khmer, Lao, Malay, Tamil, Thai, and Vietnamese. The model is designed for understanding and generating content in these languages and targets regional applications. It supports a context length of 32768 tokens, which allows long multilingual inputs.
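As a rough illustration of what the 32768-token limit means in practice, the sketch below checks whether a prompt plus the requested completion fits in the window. The helper names are hypothetical. The token counts are assumed to come from the model's tokenizer, and prompt and generated tokens are assumed to share the same window.

```python
# Context length as stated on this page.
CONTEXT_LENGTH = 32768


def fits_context(prompt_tokens: int, max_new_tokens: int,
                 context_length: int = CONTEXT_LENGTH) -> bool:
    """Return True if the prompt plus the requested completion fits in the window."""
    return prompt_tokens + max_new_tokens <= context_length


def remaining_budget(prompt_tokens: int,
                     context_length: int = CONTEXT_LENGTH) -> int:
    """Return how many tokens may still be generated after the prompt (never negative)."""
    return max(0, context_length - prompt_tokens)
```

Token counts differ noticeably between languages for the same text, so it is safer to measure the prompt with the tokenizer than to estimate from character counts.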
Overview
Llama-SEA-LION-v3-8B is an 8-billion-parameter Large Language Model (LLM) developed by AI Singapore for the Southeast Asian (SEA) region. It is built on Llama 3.1-8B-Instruct and has undergone continued pre-training on approximately 200 billion tokens across 11 languages used in the region: Burmese, Chinese, English, Filipino, Indonesian, Khmer, Lao, Malay, Tamil, Thai, and Vietnamese. The name SEA-LION stands for "Southeast Asian Languages In One Network."
Key Capabilities
- Multilingual Proficiency: Covers the 11 languages listed above, making it suitable for regional applications and cross-lingual tasks.
- Continued Pre-training: Continued pre-training on approximately 200B tokens, including SEA-LION Pile datasets, targets the linguistic nuances of SEA languages.
- Benchmark Performance: Evaluated using the SEA-HELM evaluation benchmark for general language capabilities (QA, Sentiment, Toxicity, Translation, Summarisation, Reasoning, NLI) and SEA-IFEval for constraint-following behavior in both English and SEA languages.
- Llama 3.1 Foundation: Uses the Llama 3.1 architecture and its default tokenizer.
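Because the model uses the Llama 3.1 architecture and tokenizer, it can presumably be loaded like any other causal language model. The following is a minimal sketch, assuming the Hugging Face `transformers`, `torch`, and `accelerate` packages are installed and that the weights are available under the repository name shown at the top of this page. The function name `complete`, the bfloat16 dtype, and the greedy decoding settings are illustrative choices, not recommendations from AI Singapore.

```python
MODEL_ID = "aisingapore/Llama-SEA-LION-v3-8B"  # repository name from this page


def complete(prompt: str, max_new_tokens: int = 64) -> str:
    """Generate a plain-text continuation of `prompt` (hypothetical helper)."""
    # Heavy imports are kept inside the function so that importing this
    # module does not require the libraries or download any weights.
    import torch
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
    model = AutoModelForCausalLM.from_pretrained(
        MODEL_ID,
        torch_dtype=torch.bfloat16,  # assumption: hardware supports bfloat16
        device_map="auto",           # requires the `accelerate` package
    )

    inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
    output_ids = model.generate(
        **inputs, max_new_tokens=max_new_tokens, do_sample=False
    )

    # Keep only the newly generated tokens, dropping the echoed prompt.
    new_tokens = output_ids[0][inputs["input_ids"].shape[1]:]
    return tokenizer.decode(new_tokens, skip_special_tokens=True)


# Example call (downloads the weights on first use):
# print(complete("Ibu kota Indonesia adalah"))
```

The sketch passes a plain-text prompt and does not apply a chat template. Whether a chat format is appropriate for this checkpoint is not stated on this page.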
Good for
- Applications requiring strong performance in multiple Southeast Asian languages.
- Research and development focusing on multilingual NLP tasks in the SEA region.
- Use cases demanding adherence to specific output constraints, as evaluated by SEA-IFEval.
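To make "output constraints" concrete, here is a small sketch of the kind of programmatic check an application might run on a model response. It is not the SEA-IFEval implementation. The function names and the two example constraints (a word limit and a JSON-object format) are hypothetical.

```python
import json
from typing import Callable, Dict


def within_word_limit(text: str, max_words: int) -> bool:
    """Whitespace-based word count; only meaningful for space-delimited scripts."""
    # Note: Thai, Khmer, Lao, Burmese, and Chinese are not space-delimited,
    # so a real check for those languages would need a word segmenter.
    return len(text.split()) <= max_words


def is_json_object(text: str) -> bool:
    """Return True if `text` parses as a JSON object."""
    try:
        return isinstance(json.loads(text), dict)
    except json.JSONDecodeError:
        return False


def check_constraints(text: str,
                      checks: Dict[str, Callable[[str], bool]]) -> Dict[str, bool]:
    """Run each named check on the response and report pass/fail per constraint."""
    return {name: check(text) for name, check in checks.items()}
```

A caller would pass the model's response together with a dictionary of named checks and retry or reject the response if any check fails.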
Limitations
- The model has not been aligned for safety, and developers are advised to perform their own safety fine-tuning and implement related security measures.