aisingapore/Llama-SEA-LION-v3-70B

Text Generation · Concurrency Cost: 4 · Model Size: 70B · Quant: FP8 · Ctx Length: 32k · Published: Dec 11, 2024 · License: llama3.1 · Architecture: Transformer

Llama-SEA-LION-v3-70B is a 70 billion parameter multilingual large language model developed by AI Singapore, built upon the Llama 3.1 architecture. It has undergone continued pre-training on approximately 200 billion tokens across 11 Southeast Asian languages: Burmese, Chinese, English, Filipino, Indonesian, Khmer, Lao, Malay, Tamil, Thai, and Vietnamese. The model is optimized for general language capabilities and constraint-following behavior within the Southeast Asian linguistic context, making it well suited for applications requiring strong performance in these languages.


Llama-SEA-LION-v3-70B: Multilingual LLM for Southeast Asia

Llama-SEA-LION-v3-70B is a 70 billion parameter large language model developed by AI Singapore, building on Llama 3.1-70B-Instruct. It has undergone continued pre-training on approximately 200 billion tokens across 11 Southeast Asian (SEA) languages: Burmese, Chinese, English, Filipino, Indonesian, Khmer, Lao, Malay, Tamil, Thai, and Vietnamese. This continued pre-training aims to enhance its understanding and generation capabilities specifically for the SEA region.
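Because the model builds on Llama 3.1-70B-Instruct, prompts follow the Llama 3.1 chat format. The sketch below assembles that format by hand purely for illustration; in practice the tokenizer's `apply_chat_template` method handles this for you.

```python
def format_llama31_chat(messages):
    """Assemble a Llama 3.1-style chat prompt from role/content messages.

    Mirrors the Llama 3.1 special-token layout; real code should call
    tokenizer.apply_chat_template rather than hand-building the string.
    """
    parts = ["<|begin_of_text|>"]
    for msg in messages:
        parts.append(
            f"<|start_header_id|>{msg['role']}<|end_header_id|>\n\n"
            f"{msg['content']}<|eot_id|>"
        )
    # Cue the model to produce the assistant turn next.
    parts.append("<|start_header_id|>assistant<|end_header_id|>\n\n")
    return "".join(parts)

prompt = format_llama31_chat(
    [{"role": "user", "content": "Terjemahkan ke Bahasa Inggris: Selamat pagi!"}]
)
```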

Key Capabilities

  • Multilingual Proficiency: Strong performance across 11 SEA languages due to targeted pre-training.
  • General Language Tasks: Evaluated on tasks such as Question Answering, Sentiment Analysis, Toxicity Detection, Translation, Abstractive Summarization, Causal Reasoning, and Natural Language Inference using the SEA-HELM benchmark.
  • Constraint Following: Assessed for its ability to adhere to specific instructions and constraints in both English and SEA languages via SEA-IFEval, a localized version of IFEval.
  • Extensive Training Data: Utilizes a diverse dataset including SEA-LION Pile v1 and v2, Dolma, Fineweb-Edu, StackV2 (for code), and other language-specific corpora.

Good For

  • Applications requiring robust language understanding and generation in multiple Southeast Asian languages.
  • Developers building solutions for multilingual contexts within the SEA region.
  • Research and development focused on improving LLM performance for less-resourced languages.