aisingapore/Qwen-SEA-LION-v4-32B-IT

Text Generation | Concurrency Cost: 2 | Model Size: 32B | Quant: FP8 | Ctx Length: 32k | Published: Oct 16, 2025 | License: MIT | Architecture: Transformer | 0.0K | Open Weights | Warm

Qwen-SEA-LION-v4-32B-IT is a 32-billion-parameter, instruction-tuned, decoder-only language model developed by AI Singapore and based on the Qwen3 architecture. It underwent continued pre-training on approximately 100 billion tokens from the SEA-Pile v2 corpus to strengthen its coverage of Southeast Asian languages, including Burmese, Indonesian, Malay, Filipino, Tamil, Thai, and Vietnamese. With a 32k context length, the model targets multilingual instruction following and chat across the SEA region.


Qwen-SEA-LION-v4-32B-IT: Southeast Asian Language Model

Qwen-SEA-LION-v4-32B-IT is a 32-billion-parameter instruction-tuned model developed by AI Singapore and built on the Qwen3 architecture. It is part of the SEA-LION (Southeast Asian Languages In One Network) collection, which is designed to improve language understanding and generation for the Southeast Asian region.

Key Capabilities & Features

  • Multilingual Proficiency: Underwent continued pre-training on approximately 100 billion tokens from the SEA-Pile v2 corpus, covering 7 key Southeast Asian languages (Burmese, Indonesian, Malay, Filipino, Tamil, Thai, and Vietnamese) in addition to English.
  • Instruction Following: Post-trained on 8 million high-quality question-and-answer pairs to improve instruction-following and multi-turn chat capabilities.
  • Extended Context Window: Inherits a native context length of 32,768 tokens from its Qwen3-32B base.
  • Evaluation: Assessed on SEA-HELM for general language tasks, SEA-IFEval for instruction adherence, and SEA-MTBench for multi-turn chat, with results available on the SEA-LION leaderboard.
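Because the prompt and the generated completion share the 32,768-token window, it can help to check the budget before sending a request. The helper below is a minimal sketch of that arithmetic; the function names are hypothetical, and the prompt token count is assumed to come from whatever tokenizer you use with the model.

```python
# Context window stated in the feature list above (inherited from Qwen3-32B).
CONTEXT_LENGTH = 32_768


def max_new_tokens_allowed(prompt_tokens: int,
                           context_length: int = CONTEXT_LENGTH) -> int:
    """Return how many tokens can still be generated after the prompt."""
    if prompt_tokens < 0:
        raise ValueError("prompt_tokens must be non-negative")
    # A prompt longer than the window leaves no room for generation.
    return max(context_length - prompt_tokens, 0)


def fits_in_context(prompt_tokens: int,
                    requested_new_tokens: int,
                    context_length: int = CONTEXT_LENGTH) -> bool:
    """True when prompt plus requested completion stay within the window."""
    return requested_new_tokens <= max_new_tokens_allowed(prompt_tokens, context_length)
```

For example, a 30,000-token prompt leaves room for at most 2,768 generated tokens.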

Use Cases & Considerations

This model is well suited to applications that need strong performance in Southeast Asian languages and complex instruction following. It also supports a "thinking mode" feature for enhanced response generation. Developers should note that the model has not been aligned for safety, so safety-critical applications require further fine-tuning.
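When thinking mode is enabled, the reasoning usually needs to be separated from the final answer before it is shown to a user. The sketch below assumes the model wraps its reasoning in `<think>...</think>` tags, as upstream Qwen3 models do; this page does not confirm the output format, and `split_thinking` is a hypothetical helper name.

```python
import re
from typing import Tuple

# Assumption: reasoning is wrapped in <think>...</think>, as in upstream Qwen3.
_THINK_BLOCK = re.compile(r"<think>(.*?)</think>", re.DOTALL)


def split_thinking(text: str) -> Tuple[str, str]:
    """Return (reasoning, answer); reasoning is empty when no block is present."""
    match = _THINK_BLOCK.search(text)
    if match is None:
        return "", text.strip()
    reasoning = match.group(1).strip()
    # Keep everything outside the first think block as the visible answer.
    answer = (text[:match.start()] + text[match.end():]).strip()
    return reasoning, answer
```

Responses without a think block pass through unchanged apart from whitespace trimming, so the same helper can be used whether or not thinking mode is active.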

Popular Sampler Settings

Featherless lists the top 3 parameter combinations its users apply to this model. Each combination sets the following parameters: temperature, top_p, top_k, frequency_penalty, presence_penalty, repetition_penalty, and min_p.
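The sampler parameters named above are typically passed as top-level fields of a chat-completion request. The sketch below builds such a request body, assuming an OpenAI-style chat completions schema that also accepts `top_k`, `repetition_penalty`, and `min_p` (extensions that many open-model servers support, but not confirmed by this page). `build_chat_request` is a hypothetical helper, and the values in `EXAMPLE_SAMPLER` are placeholders for illustration, not the settings reported for this model.

```python
from typing import Any, Dict, List

MODEL_ID = "aisingapore/Qwen-SEA-LION-v4-32B-IT"

# The seven sampler parameters named in this section.
ALLOWED_SAMPLER_KEYS = {
    "temperature", "top_p", "top_k", "frequency_penalty",
    "presence_penalty", "repetition_penalty", "min_p",
}

# Placeholder values for illustration only; not taken from the page.
EXAMPLE_SAMPLER: Dict[str, float] = {
    "temperature": 0.7,
    "top_p": 0.9,
    "top_k": 40,
    "repetition_penalty": 1.05,
    "min_p": 0.05,
}


def build_chat_request(messages: List[Dict[str, str]],
                       sampler: Dict[str, float],
                       max_tokens: int = 512) -> Dict[str, Any]:
    """Assemble a request body, rejecting sampler keys not listed above."""
    unknown = set(sampler) - ALLOWED_SAMPLER_KEYS
    if unknown:
        raise ValueError(f"unsupported sampler keys: {sorted(unknown)}")
    # Sampler settings are merged in as top-level fields of the body.
    return {"model": MODEL_ID, "messages": messages,
            "max_tokens": max_tokens, **sampler}
```

The resulting dictionary can be serialized with `json.dumps` and posted to whichever compatible endpoint you use; check that endpoint's documentation for which of these fields it actually honors.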