aisingapore/Qwen-SEA-LION-v4-32B-IT

Hugging Face
Text generation · Concurrency cost: 2 · Model size: 32B · Quant: FP8 · Context length: 32K · Published: Oct 16, 2025 · Architecture: Transformer

Qwen-SEA-LION-v4-32B-IT is a 32.7-billion-parameter, instruction-tuned, decoder-only language model developed by AI Singapore on the Qwen3 architecture. It underwent continued pre-training on 100 billion tokens from the SEA-Pile v2 corpus, targeting seven Southeast Asian languages: Burmese, Indonesian, Malay, Filipino, Tamil, Thai, and Vietnamese. With a 32K context length and post-training on 8 million Q&A pairs, the model delivers strong multilingual understanding and generation for the SEA region.


Qwen-SEA-LION-v4-32B-IT Overview

Qwen-SEA-LION-v4-32B-IT is an instruction-tuned large language model developed by AI Singapore, building upon the Qwen3-32B foundation. This model is specifically designed for the Southeast Asian (SEA) region, having undergone extensive continued pre-training on approximately 100 billion tokens from the SEA-Pile v2 corpus. This corpus includes data across seven key SEA languages: Burmese, Indonesian, Malay, Filipino, Tamil, Thai, and Vietnamese, alongside English.
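
Below is a minimal sketch of loading the model with Hugging Face transformers and generating a response. The model ID comes from this page; the Indonesian prompt and generation settings are illustrative assumptions, and the snippet presumes hardware with enough GPU memory for a 32B checkpoint.

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "aisingapore/Qwen-SEA-LION-v4-32B-IT"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype="auto",  # use the checkpoint's native dtype
    device_map="auto",   # shard across available GPUs
)

# Indonesian prompt (illustrative): "Explain what SEA-LION is in one paragraph."
messages = [
    {"role": "user", "content": "Jelaskan apa itu SEA-LION dalam satu paragraf."}
]
inputs = tokenizer.apply_chat_template(
    messages,
    add_generation_prompt=True,
    return_tensors="pt",
).to(model.device)

outputs = model.generate(inputs, max_new_tokens=256)
# Decode only the newly generated tokens, not the prompt.
print(tokenizer.decode(outputs[0][inputs.shape[-1]:], skip_special_tokens=True))
```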

Key Capabilities & Features

  • Multilingual Proficiency: Enhanced understanding and generation in Burmese, English, Indonesian, Khmer, Lao, Malay, Mandarin, Tagalog, Tamil, Thai, and Vietnamese.
  • Qwen3 Architecture: Inherits robust reasoning capabilities and support for over 100 languages from its Qwen3 base.
  • Extended Context Length: Features a native context length of 32,768 tokens.
  • Instruction Following: Post-trained on 8 million high-quality question-and-answer pairs for improved instruction adherence.
  • Evaluation: Assessed using SEA-HELM for general language tasks (QA, Sentiment, Translation, MMLU Lite) and SEA-IFEval/SEA-MTBench for instruction-following and multi-turn chat, with results available on the SEA-LION leaderboard.

Considerations for Use

Developers should note that the model has not been safety-aligned; they should perform their own safety fine-tuning and add guardrails appropriate to their application. Like other LLMs, it may hallucinate and occasionally generate irrelevant content. The model also supports a "thinking mode," enabled by passing `enable_thinking=True` during generation, as sketched below.
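
As a short sketch of the toggle, reusing the `tokenizer` and `model` objects from the loading example above: `enable_thinking` is a Qwen3 chat-template flag, so it is passed through `apply_chat_template` rather than to `generate` directly.

```python
# Render the chat template with thinking mode on (Qwen3-style template flag).
text = tokenizer.apply_chat_template(
    messages,
    tokenize=False,
    add_generation_prompt=True,
    enable_thinking=True,  # set False to suppress the <think>...</think> block
)
inputs = tokenizer(text, return_tensors="pt").to(model.device)

outputs = model.generate(**inputs, max_new_tokens=1024)
print(tokenizer.decode(
    outputs[0][inputs["input_ids"].shape[-1]:],
    skip_special_tokens=True,
))
```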

Popular Sampler Settings

The three most popular parameter combinations among Featherless users for this model each set the following sampler parameters:

  • temperature
  • top_p
  • top_k
  • frequency_penalty
  • presence_penalty
  • repetition_penalty
  • min_p
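
As a hedged sketch of where these parameters plug in, the snippet below uses an OpenAI-compatible chat completions client. The base URL, API key placeholder, and all numeric values are assumptions rather than the actual top-3 configs, and samplers outside the OpenAI spec (top_k, min_p, repetition_penalty) are passed through the client's `extra_body` escape hatch; check the provider's API docs for the exact names it accepts.

```python
from openai import OpenAI

client = OpenAI(
    base_url="https://api.featherless.ai/v1",  # assumed endpoint
    api_key="YOUR_API_KEY",
)

response = client.chat.completions.create(
    model="aisingapore/Qwen-SEA-LION-v4-32B-IT",
    # Filipino prompt (illustrative): "Write a short poem about the sea."
    messages=[{"role": "user", "content": "Magbigay ng maikling tula tungkol sa dagat."}],
    # Standard OpenAI sampler parameters (placeholder values):
    temperature=0.7,
    top_p=0.9,
    frequency_penalty=0.0,
    presence_penalty=0.0,
    # Non-standard samplers are typically forwarded via extra_body:
    extra_body={"top_k": 40, "min_p": 0.05, "repetition_penalty": 1.05},
)
print(response.choices[0].message.content)
```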