aisingapore/Gemma-SEA-LION-v3-9B

Text Generation · Concurrency Cost: 1 · Model Size: 9B · Quant: FP8 · Ctx Length: 16k · Published: Oct 30, 2024 · License: gemma · Architecture: Transformer

Gemma-SEA-LION-v3-9B is a 9 billion parameter decoder-only language model developed by AI Singapore and built on the Gemma 2 architecture. It has undergone continued pre-training on approximately 200 billion tokens across 11 Southeast Asian languages, including English, Chinese, Vietnamese, and Indonesian. The model targets multilingual applications in the Southeast Asian region and covers general language tasks such as question answering and sentiment analysis.


Gemma-SEA-LION-v3-9B: Southeast Asian Multilingual LLM

Gemma-SEA-LION-v3-9B is a 9 billion parameter large language model developed by AI Singapore and designed for the Southeast Asian (SEA) region. It is built on the Gemma 2 architecture and has undergone continued pre-training on approximately 200 billion tokens. The training data covers 11 official Southeast Asian languages: English, Chinese, Vietnamese, Indonesian, Thai, Tamil, Filipino, Malay, Khmer, Lao, and Burmese.
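For readers who want to try the model locally, the following is a minimal sketch of loading it for plain text completion. It assumes the `transformers`, `torch`, and `accelerate` packages are installed and that the checkpoint is published under the repository id shown in the page title. The `generate` helper, the bfloat16 dtype, and the greedy decoding settings are illustrative choices, not documented recommendations, and the FP8 quantization listed above is not reproduced here.

```python
MODEL_ID = "aisingapore/Gemma-SEA-LION-v3-9B"  # repository id taken from the page title


def generate(prompt: str, max_new_tokens: int = 32) -> str:
    """Return a greedy continuation of `prompt` (hypothetical helper)."""
    # Heavy dependencies are imported lazily so that defining this
    # function does not require them to be installed.
    import torch
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
    model = AutoModelForCausalLM.from_pretrained(
        MODEL_ID,
        torch_dtype=torch.bfloat16,  # assumption: bf16-capable hardware
        device_map="auto",           # requires the accelerate package
    )

    inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
    with torch.no_grad():
        output_ids = model.generate(
            **inputs, max_new_tokens=max_new_tokens, do_sample=False
        )

    # Keep only the newly generated tokens, not the echoed prompt.
    new_tokens = output_ids[0, inputs["input_ids"].shape[1]:]
    return tokenizer.decode(new_tokens, skip_special_tokens=True)


# Example call (downloads the full 9B checkpoint on first use):
# print(generate("Ibu kota Indonesia adalah"))
```

Because the description above mentions only continued pre-training, this sketch treats the model as a text-completion model and does not apply a chat template.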

Key Capabilities

  • Multilingual Proficiency: Strong performance across a diverse set of Southeast Asian languages, making it suitable for regional applications.
  • General Language Tasks: Evaluated on the SEA-HELM benchmark for tasks such as Question Answering, Sentiment Analysis, Toxicity Detection, Translation, Abstractive Summarization, Causal Reasoning, and Natural Language Inference.
  • Continued Pre-training: Leverages a significant volume of regional language data, including specialized datasets like SEA-LION Pile v1 and v2, to enhance its understanding of local linguistic nuances.
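As an illustration of how one of the listed tasks might be posed to a pre-trained model, here is a minimal sketch of a few-shot sentiment analysis prompt. The `build_few_shot_prompt` helper, the prompt layout, and the example sentences are all hypothetical; they are not taken from SEA-HELM or from AI Singapore's documentation.

```python
from typing import List, Tuple


def build_few_shot_prompt(
    examples: List[Tuple[str, str]],
    query: str,
    instruction: str = "Classify the sentiment of each sentence as Positive or Negative.",
) -> str:
    """Assemble a plain-text few-shot prompt that ends where the model should answer."""
    parts = [instruction, ""]
    for text, label in examples:
        parts.append(f"Sentence: {text}")
        parts.append(f"Sentiment: {label}")
        parts.append("")  # blank line between demonstrations
    parts.append(f"Sentence: {query}")
    parts.append("Sentiment:")  # left open for the model to complete
    return "\n".join(parts)


# Illustrative demonstrations mixing two of the supported languages.
examples = [
    ("Makanan di sini sangat enak.", "Positive"),  # Indonesian
    ("Dịch vụ quá chậm.", "Negative"),             # Vietnamese
]
prompt = build_few_shot_prompt(examples, "Pelayanannya mengecewakan.")
print(prompt)
```

The resulting string can be passed to any text-completion interface for the model; the first generated word would then be read as the predicted label.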

Good For

  • Applications requiring robust language understanding and generation in multiple Southeast Asian languages.
  • Developers and researchers focusing on regional NLP challenges and solutions.
  • Tasks that benefit from a model specifically tuned for the linguistic diversity of Southeast Asia.