aisingapore/Gemma-SEA-LION-v4-4B-VL
Gemma-SEA-LION-v4-4B-VL is a 4-billion-parameter Vision-Language Model (VLM) developed by AI Singapore, built on the gemma-3-4b-it architecture with a 128K-token context length. It is post-trained on 6.7 million instruction-text pairs covering Southeast Asian (SEA) languages, including Indonesian, Vietnamese, Thai, Filipino, Tamil, Burmese, and Malay. The model offers strong multilingual and multicultural fluency for the SEA region, along with tool-calling support and enhanced visual text parsing in Thai, Chinese, and English.
Overview
Gemma-SEA-LION-v4-4B-VL is a 4-billion parameter Vision-Language Model (VLM) developed by AI Singapore, based on the gemma-3-4b-it architecture. It features a substantial context length of 128K tokens and has undergone extensive post-training on approximately 6.7 million instruction-text pairs. This training specifically targets Southeast Asian (SEA) languages and cultural nuances, enhancing its multilingual and multicultural fluency.
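A minimal inference sketch with Hugging Face `transformers` is shown below. The message format and the `AutoProcessor`/`AutoModelForImageTextToText` classes are assumptions based on the gemma-3-4b-it base model's standard Gemma 3 multimodal interface; check the model card's own usage snippet for the authoritative version.

```python
def build_messages(image_url: str, question: str) -> list:
    """Build one multimodal user turn in the Gemma 3 chat message format (assumed)."""
    return [
        {
            "role": "user",
            "content": [
                {"type": "image", "url": image_url},
                {"type": "text", "text": question},
            ],
        }
    ]


def run_inference(image_url: str, question: str) -> str:
    """Sketch of a full generation pass; downloads ~4B parameters on first call."""
    # Heavy imports are kept inside the function so the module is cheap to import.
    import torch
    from transformers import AutoModelForImageTextToText, AutoProcessor

    model_id = "aisingapore/Gemma-SEA-LION-v4-4B-VL"
    processor = AutoProcessor.from_pretrained(model_id)
    model = AutoModelForImageTextToText.from_pretrained(
        model_id, torch_dtype=torch.bfloat16, device_map="auto"
    )
    inputs = processor.apply_chat_template(
        build_messages(image_url, question),
        add_generation_prompt=True,
        tokenize=True,
        return_dict=True,
        return_tensors="pt",
    ).to(model.device)
    out = model.generate(**inputs, max_new_tokens=256)
    # Decode only the newly generated tokens, skipping the prompt.
    return processor.decode(
        out[0][inputs["input_ids"].shape[-1]:], skip_special_tokens=True
    )
```

The nested `content` list lets a single turn interleave images and text, which is how the base Gemma 3 chat template expects multimodal input.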
Key Capabilities
- Multilingual Fluency: Optimized for Indonesian, Vietnamese, Thai, Filipino, Tamil, Burmese, and Malay.
- Vision-Language Integration: Inherits image and text capabilities from its base model, with enhanced visual parsing in Thai, Chinese, and English.
- Tool Calling: Includes function calling capabilities, enabling its use in tool-calling applications.
- Robust Evaluation: Assessed with SEA-HELM for general language capabilities, SEA-IFEval for instruction following, and SEA-MTBench for multi-turn chat, with task-specific metrics spanning QA, sentiment analysis, and translation.
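For the tool-calling capability, an application typically passes the model a function schema and parses the structured call it emits. The OpenAI-style JSON schema shape and the `get_weather` function below are illustrative assumptions, not the model's documented convention:

```python
import json

# Illustrative tool schema (OpenAI-style); the exact schema convention the
# model was trained on is an assumption here.
GET_WEATHER_TOOL = {
    "name": "get_weather",  # hypothetical function for illustration
    "description": "Get the current weather for a city.",
    "parameters": {
        "type": "object",
        "properties": {
            "city": {"type": "string", "description": "City name, e.g. 'Jakarta'"},
        },
        "required": ["city"],
    },
}


def parse_tool_call(raw: str) -> dict:
    """Parse a JSON tool call emitted by the model and check required arguments."""
    call = json.loads(raw)
    required = GET_WEATHER_TOOL["parameters"]["required"]
    missing = [k for k in required if k not in call.get("arguments", {})]
    if missing:
        raise ValueError(f"tool call missing required arguments: {missing}")
    return call
```

Validating the call before dispatching it keeps malformed or incomplete model output from reaching the actual tool.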
When to Use This Model
This model is particularly well-suited for applications requiring:
- Multilingual processing in Southeast Asian languages.
- Vision-language tasks with a focus on text extraction in Thai, Chinese, and English.
- Tool-augmented conversational agents or applications that benefit from function calling.
Note that the model has not been safety-aligned; users should implement their own safety measures and guardrails before deployment.