aisingapore/Qwen-SEA-LION-v4-32B-IT
Text Generation · Concurrency Cost: 2 · Model Size: 32B · Quant: FP8 · Ctx Length: 32k · Published: Oct 16, 2025 · Architecture: Transformer

Qwen-SEA-LION-v4-32B-IT is a 32 billion parameter instruction-tuned decoder-only large language model developed by AI Singapore. Based on the Qwen3 architecture, it underwent continued pre-training on 100 billion tokens from the SEA-Pile v2 corpus, specifically targeting seven Southeast Asian languages. This model is optimized for multilingual understanding and generation within the Southeast Asian context, supporting a 32K token context length.
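As a hedged usage sketch, the checkpoint can be loaded like any Qwen3-family model via Hugging Face transformers. The repository ID comes from this card; the dtype, device placement, and the Indonesian prompt are illustrative assumptions rather than requirements.

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "aisingapore/Qwen-SEA-LION-v4-32B-IT"

# The tokenizer carries the Qwen3 chat template; dtype/device choices are illustrative.
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,
    device_map="auto",
)

# Illustrative Indonesian prompt to exercise the SEA-language tuning.
messages = [
    {"role": "user", "content": "Jelaskan apa itu kecerdasan buatan dalam dua kalimat."}
]
input_ids = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

output_ids = model.generate(input_ids, max_new_tokens=256)
# Decode only the newly generated tokens, not the prompt.
print(tokenizer.decode(output_ids[0][input_ids.shape[-1]:], skip_special_tokens=True))
```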


Qwen-SEA-LION-v4-32B-IT: Southeast Asian Language Model

Qwen-SEA-LION-v4-32B-IT is a 32 billion parameter instruction-tuned model developed by AI Singapore, building upon the Qwen3 architecture. It is specifically designed for the Southeast Asian (SEA) region, having undergone extensive continued pre-training on approximately 100 billion tokens from the SEA-Pile v2 corpus, which includes data from Burmese, Indonesian, Malay, Filipino, Tamil, Thai, and Vietnamese. The model was further refined through post-training on 8 million high-quality question-and-answer pairs.

Key Capabilities & Features

  • Multilingual Proficiency: Enhanced understanding and generation in seven key Southeast Asian languages, in addition to English and other languages supported by Qwen3.
  • Extended Context Length: Inherits a native 32,768 token context length from its Qwen3 base.
  • Instruction Following: Fine-tuned for instruction-following and multi-turn chat, evaluated with SEA-IFEval and SEA-MTBench (see the multi-turn sketch after this list).
  • Text-Only: Focused purely on text-based tasks; the SEA-specific training adds no vision capabilities beyond the Qwen3-32B base.
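As a minimal sketch of the multi-turn behavior, the snippet below drives the model through transformers' chat-capable text-generation pipeline. Passing a message list assumes a recent transformers release; the prompts and generation settings are illustrative.

```python
from transformers import pipeline

# Chat-capable text-generation pipeline; dtype/device settings are illustrative.
generator = pipeline(
    "text-generation",
    model="aisingapore/Qwen-SEA-LION-v4-32B-IT",
    torch_dtype="auto",
    device_map="auto",
)

history = [
    {"role": "user", "content": "Terjemahkan ke bahasa Inggris: 'Selamat pagi, apa kabar?'"}
]
# The pipeline returns the full conversation, with the new assistant turn last.
reply = generator(history, max_new_tokens=128)[0]["generated_text"][-1]
history.append(reply)

# The second turn refers back to the first, exercising multi-turn context.
history.append({"role": "user", "content": "Now translate the same sentence into Thai."})
print(generator(history, max_new_tokens=128)[0]["generated_text"][-1]["content"])
```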

Use Cases & Considerations

This model is well suited to applications requiring strong language understanding and generation in a Southeast Asian context. Developers should note that it has not been safety-aligned and requires further safety fine-tuning before production use. Like many LLMs, it may hallucinate or occasionally produce irrelevant content. Evaluation was conducted zero-shot across tasks including QA, sentiment analysis, translation, and cultural knowledge, using the SEA-HELM benchmark.

Popular Sampler Settings

Featherless surfaces the top three sampler configurations its users apply to this model; the tunable parameters cover temperature, top_p, top_k, frequency_penalty, presence_penalty, repetition_penalty, and min_p.
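As an illustration of how these samplers might be set in practice, the sketch below calls the model through an OpenAI-compatible endpoint. The base URL, the extra_body pass-through for non-standard samplers, and every parameter value are assumptions for illustration, not the actual top user configurations.

```python
from openai import OpenAI

# Assumes Featherless's OpenAI-compatible endpoint; base URL and all values
# below are illustrative, not the recorded top user configurations.
client = OpenAI(base_url="https://api.featherless.ai/v1", api_key="YOUR_API_KEY")

response = client.chat.completions.create(
    model="aisingapore/Qwen-SEA-LION-v4-32B-IT",
    messages=[
        {"role": "user",
         "content": "Magbigay ng maikling paliwanag tungkol sa klima ng Timog-Silangang Asya."}
    ],
    temperature=0.7,
    top_p=0.9,
    frequency_penalty=0.0,
    presence_penalty=0.0,
    # Non-standard samplers are often accepted by OpenAI-compatible servers
    # via extra request-body fields.
    extra_body={"top_k": 40, "min_p": 0.05, "repetition_penalty": 1.05},
)
print(response.choices[0].message.content)
```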