aisingapore/Qwen-SEA-LION-v4-8B-VL

VISIONConcurrency Cost:1Model Size:8BQuant:FP8Ctx Length:32kPublished:Nov 21, 2025Architecture:Transformer0.0K Cold

The aisingapore/Qwen-SEA-LION-v4-8B-VL is an 8-billion parameter Vision-Language Model (VLM) developed by AI Products Pillar, AI Singapore. Built upon the Qwen3-VL-8B-Instruct architecture, it features a native 256K context window and enhanced vision-language capabilities. This model is specifically fine-tuned for Southeast Asian languages and cultures, supporting English and 7 key SEA languages through extensive supervised fine-tuning on 9 million instruction-text pairs.

Loading preview...

Qwen-SEA-LION-v4-8B-VL: Southeast Asian Vision-Language Model

Qwen-SEA-LION-v4-8B-VL is an 8-billion parameter Vision-Language Model (VLM) developed by AI Products Pillar, AI Singapore. It is based on the Qwen3-VL-8B-Instruct architecture and has been rigorously fine-tuned for Southeast Asian (SEA) languages and cultures. The model underwent supervised fine-tuning (SFT) on approximately 9 million instruction-text pairs to achieve strong multilingual and multicultural fluency.

Key Capabilities

  • Multilingual Support: Proficient in English and 7 key SEA languages: Burmese, Indonesian, Filipino, Malay, Tamil, Thai, and Vietnamese.
  • Vision-Language Integration: Inherits and retains the high-performance vision-language capabilities of the Qwen3-VL base model, including Visual Question Answering (VQA) and Image Captioning.
  • Long Context: Features a native 256K context window, enabling processing of extensive multimodal inputs.
  • Edge-Optimized Inference: Designed for resource-efficient operation.
  • Tool Use: Supports tool use functionalities.

Use Cases

This model is ideal for applications requiring robust language understanding and generation, especially in a Southeast Asian context, combined with advanced vision capabilities. It excels in tasks demanding cultural and linguistic nuance across the specified SEA languages. Evaluation was performed using benchmarks like SEA-HELM, SEA-IFEval, and SEA-MTBench, demonstrating its performance in general language, instruction-following, and multi-turn chat, while also confirming the retention of its strong VL capabilities.