aisingapore/Gemma-SEA-LION-v3-9B-IT

Hugging Face: aisingapore/Gemma-SEA-LION-v3-9B-IT

Text generation · Concurrency cost: 1 · Model size: 9B · Quantization: FP8 · Context length: 16K · Published: Oct 30, 2024 · License: Gemma · Architecture: Transformer

Gemma-SEA-LION-v3-9B-IT is a 9 billion parameter instruction-tuned decoder-only large language model developed by AI Singapore, based on the Gemma2 architecture. It is specifically pretrained and instruction-tuned for the Southeast Asia (SEA) region, supporting 13 languages: Burmese, Chinese, English, Filipino, Indonesian, Javanese, Khmer, Lao, Malay, Sundanese, Tamil, Thai, and Vietnamese. With a context length of 8192 tokens, this model excels at instruction-following tasks in a multilingual Southeast Asian context.


Gemma-SEA-LION-v3-9B-IT: Southeast Asian Language Model

Gemma-SEA-LION-v3-9B-IT is a 9 billion parameter instruction-tuned model developed by AI Singapore, building upon the Gemma2 architecture. It is part of the SEA-LION (Southeast Asian Languages In One Network) collection, specifically designed and optimized for the Southeast Asian region.

Key Capabilities & Features

  • Multilingual Support: Supports 13 languages: Burmese, Chinese, English, Filipino, Indonesian, Javanese, Khmer, Lao, Malay, Sundanese, Tamil, Thai, and Vietnamese.
  • Instruction-Tuned: Fine-tuned for instruction-following in both English and various ASEAN languages (see the usage sketch after this list).
  • Gemma2 Architecture: Utilizes the Gemma2 decoder model for its base architecture.
  • Context Length: Features a context length of 8192 tokens.
  • Evaluated Performance: Benchmarked using the SEA-HELM evaluation framework for general language capabilities (QA, Sentiment, Translation, Summarization, etc.) and instruction-following capabilities with localized IFEval and MT-Bench datasets.
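
The instruction-tuned checkpoint is meant to be prompted through a chat template. The sketch below is a hedged illustration, not an official example: it assumes the transformers and torch packages, standard Gemma2 chat formatting, and bfloat16 hardware support, and shows one way to load the model from Hugging Face and ask a question in Indonesian.

```python
# Minimal sketch (not an official snippet) of loading the model and prompting
# it via the tokenizer's chat template. Dtype and device settings are
# assumptions; adjust for your hardware.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "aisingapore/Gemma-SEA-LION-v3-9B-IT"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,  # assumed dtype; use float16/float32 as needed
    device_map="auto",
)

# Gemma2-style chat models expect prompts built with the chat template.
messages = [
    {"role": "user", "content": "Apa ibu kota Indonesia?"}  # "What is the capital of Indonesia?"
]
inputs = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

outputs = model.generate(inputs, max_new_tokens=128)
print(tokenizer.decode(outputs[0][inputs.shape[-1]:], skip_special_tokens=True))
```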

Intended Use Cases

This model is suitable for applications requiring strong instruction-following and language generation across a diverse set of Southeast Asian languages. It is particularly useful for tasks like question answering, sentiment analysis, translation, and conversational AI within the SEA context. Developers should note that the model has not been aligned for safety and requires custom safety fine-tuning for production deployment.
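
As a model hosted on Featherless, it can also be queried through an OpenAI-compatible chat completions endpoint. The snippet below is a hedged sketch only: the base URL https://api.featherless.ai/v1 and the FEATHERLESS_API_KEY environment variable are assumptions, so check the Featherless documentation for the exact endpoint and authentication details.

```python
# Hedged sketch of calling the hosted model through an OpenAI-compatible API.
# The base URL and API-key environment variable are assumptions; consult the
# Featherless docs for the actual values.
import os
from openai import OpenAI

client = OpenAI(
    base_url="https://api.featherless.ai/v1",     # assumed endpoint
    api_key=os.environ["FEATHERLESS_API_KEY"],    # assumed env var name
)

response = client.chat.completions.create(
    model="aisingapore/Gemma-SEA-LION-v3-9B-IT",
    messages=[
        {"role": "user", "content": "Terjemahkan ke bahasa Inggris: 'Selamat pagi, apa kabar?'"}
    ],
    temperature=0.7,
)
print(response.choices[0].message.content)
```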

Popular Sampler Settings

Featherless tracks the top three parameter combinations used with this model across the following sampler settings: temperature, top_p, top_k, frequency_penalty, presence_penalty, repetition_penalty, and min_p.
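
For local inference with transformers, most of these sampler settings map directly onto generation parameters. The sketch below uses illustrative placeholder values, not the configurations Featherless users actually report; frequency and presence penalties are OpenAI-style API parameters and are usually applied at the serving layer rather than in transformers.

```python
# Illustrative sketch: mapping the sampler settings above onto transformers
# generation arguments. The numeric values are placeholders, not the actual
# top configurations reported by Featherless users.
from transformers import GenerationConfig

gen_config = GenerationConfig(
    do_sample=True,
    temperature=0.7,         # placeholder value
    top_p=0.9,               # placeholder value
    top_k=50,                # placeholder value
    repetition_penalty=1.1,  # placeholder value
    min_p=0.05,              # placeholder value (requires a recent transformers release)
)

# frequency_penalty and presence_penalty are set on the serving request
# (OpenAI-compatible APIs), not in GenerationConfig.

# Reusing `model`, `tokenizer`, and `inputs` from the loading sketch above:
# outputs = model.generate(inputs, generation_config=gen_config, max_new_tokens=256)
```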