aisingapore/Llama-SEA-LION-v2-8B-IT

TEXT GENERATIONConcurrency Cost:1Model Size:8BQuant:FP8Ctx Length:8kTool Calling:SupportedPublished:Jul 30, 2024License:llama3Architecture:Transformer0.0K Cold

Llama-SEA-LION-v2-8B-IT is an 8 billion parameter decoder-only language model developed by AI Singapore, built upon the Llama3 architecture with an 8192 token context length. It is specifically instruction-tuned for Southeast Asian languages, including Indonesian, Thai, Vietnamese, and Tamil, in addition to English. This model excels in general language understanding and instruction-following tasks across these languages, making it suitable for applications requiring multilingual capabilities in the SEA region.

Loading preview...

Model Overview

Llama-SEA-LION-v2-8B-IT is an 8 billion parameter instruction-tuned language model developed by AI Singapore, based on the Llama3 architecture. It features an 8192 token context length and utilizes the default Llama 3 8B Instruct tokenizer. This model is specifically designed and instruction-tuned for the Southeast Asian (SEA) region, supporting English, Indonesian, Thai, Vietnamese, and Tamil.

Key Capabilities & Features

  • Multilingual Support: Instruction-tuned in English and several ASEAN languages, including Indonesian, Thai, and Vietnamese.
  • General Language Understanding: Evaluated using the SEA-HELM (BHASA) benchmark across tasks like Question Answering, Sentiment Analysis, Toxicity Detection, Translation, Summarization, Causal Reasoning, and Natural Language Inference.
  • Instruction Following: Assessed on SEA-IFEval for adherence to prompt constraints and SEA-MTBench for multi-turn conversational abilities, with datasets localized and translated by linguists.

Use Cases & Considerations

This model is well-suited for applications requiring robust language understanding and instruction-following in a multilingual context, particularly within Southeast Asian languages. Developers should note that the model has not been aligned for safety and may exhibit limitations such as hallucination or occasional irrelevant content, requiring users to perform their own safety fine-tuning and validation.