aisingapore/Llama-SEA-LION-v2-8B

Task: Text Generation | Concurrency Cost: 1 | Model Size: 8B | Quant: FP8 | Ctx Length: 8k | Tool Calling: Supported | Published: Jul 30, 2024 | License: llama3 | Architecture: Transformer

Llama-SEA-LION-v2-8B is an 8 billion parameter decoder-only large language model developed by AI Singapore, based on the Llama 3 architecture. It has undergone continued pre-training on approximately 48 billion tokens across five languages: English and four languages used in Southeast Asia (Indonesian, Tamil, Thai, and Vietnamese). The model targets general language tasks in these languages, making it suitable for multilingual applications in the Southeast Asian region. It uses the default Llama 3 8B Instruct tokenizer and supports a context length of 8192 tokens.


Llama-SEA-LION-v2-8B: Southeast Asian Multilingual LLM

Llama-SEA-LION-v2-8B is an 8 billion parameter large language model developed by AI Singapore and designed for the Southeast Asian (SEA) region. Built upon Meta-Llama-3-8B-Instruct, the model has undergone continued pre-training on approximately 48 billion tokens across five languages: English, Indonesian, Tamil, Thai, and Vietnamese.

Key Capabilities & Features

  • Multilingual Proficiency: Specialized in English and four languages used in Southeast Asia: Indonesian, Tamil, Thai, and Vietnamese.
  • Llama 3 Architecture: Uses the Llama 3 decoder-only transformer architecture.
  • Extensive Pre-training: Continued pre-training on approximately 48 billion tokens from a mixed dataset that includes language-specific corpora such as SEA-LION Pile and WangChanBERTa.
  • General Language Tasks: Evaluated on the BHASA benchmark for tasks such as Question Answering, Sentiment Analysis, Toxicity Detection, Translation, Summarization, Causal Reasoning, and Natural Language Inference.
  • Community License: Released under the Llama 3 Community License; check its terms before commercial use or redistribution.

When to Use This Model

  • Southeast Asian Applications: Ideal for developers building applications that require strong language understanding and generation in English, Indonesian, Tamil, Thai, or Vietnamese.
  • Multilingual Chatbots & Assistants: Suitable for creating conversational AI systems tailored for the SEA market.
  • Research & Development: Provides a strong base for further fine-tuning or research into multilingual LLMs for the region.
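A minimal sketch of running the model locally while respecting the 8192-token context length noted above. It assumes the weights are published on the Hugging Face Hub under the repository id shown in the title, and that the `transformers`, `torch`, and `accelerate` packages are installed; the helper names `max_new_tokens_for` and `generate` are hypothetical and chosen only for illustration.

```python
MODEL_ID = "aisingapore/Llama-SEA-LION-v2-8B"  # assumed Hugging Face Hub repo id
CONTEXT_LENGTH = 8192  # context length stated in the model description


def max_new_tokens_for(prompt_tokens: int, requested: int,
                       context_length: int = CONTEXT_LENGTH) -> int:
    """Clamp the requested output length so prompt + output fits the context."""
    if prompt_tokens >= context_length:
        raise ValueError("prompt already fills the context window")
    return min(requested, context_length - prompt_tokens)


def generate(prompt: str, requested_new_tokens: int = 128) -> str:
    """Generate a continuation of `prompt` (hypothetical helper)."""
    # Imported lazily so the budgeting helper above works without these packages.
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
    # device_map="auto" relies on the accelerate package being installed.
    model = AutoModelForCausalLM.from_pretrained(
        MODEL_ID, torch_dtype="auto", device_map="auto"
    )
    inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
    budget = max_new_tokens_for(inputs["input_ids"].shape[1], requested_new_tokens)
    output = model.generate(**inputs, max_new_tokens=budget)
    return tokenizer.decode(output[0], skip_special_tokens=True)


if __name__ == "__main__":
    # Indonesian prompt: "The capital of Indonesia is"
    print(generate("Ibu kota Indonesia adalah"))
```

The budgeting step is kept separate from model loading so it can be reused with a hosted endpoint, where the same prompt-plus-output limit would presumably apply.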

Note: This model has not been aligned for safety, and users are advised to perform their own safety fine-tuning.