Model Overview
Gemma2-9b-WangchanLIONv2-instruct is a 9-billion-parameter multilingual decoder-only model developed collaboratively by AI Singapore and VISTEC. Built on the Gemma2 architecture, it has been extensively instruction-tuned for Thai using a dataset of approximately 3.76 million Thai instruction-completion pairs, spanning human-annotated, FLAN-style automatic, and synthetic samples. The model supports English and Thai, uses the default Gemma-2-9B tokenizer, and has a context length of 8,192 tokens.
Key Capabilities
- Thai Language Instruction Following: Optimized for understanding and generating responses based on Thai instructions.
- Multilingual Support: Capable of processing and generating text in both English and Thai.
- Parameter-Efficient Fine-tuning: Trained with LoRA for 3 days on 8x H100-80GB GPUs.
- Evaluated on Thai Benchmarks: Performance assessed using the Thai LLM Benchmark, with results available on the leaderboard.
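The LoRA approach mentioned above freezes the pretrained weights and learns only a low-rank update. A minimal NumPy sketch of the idea (the dimensions, rank, and scaling here are illustrative, not the model's actual configuration):

```python
import numpy as np

rng = np.random.default_rng(0)

d_out, d_in, r = 64, 64, 8             # illustrative sizes; LoRA rank r << d
alpha = 16.0                           # LoRA scaling hyperparameter

W = rng.normal(size=(d_out, d_in))     # frozen pretrained weight
A = rng.normal(size=(r, d_in)) * 0.01  # trainable down-projection
B = np.zeros((d_out, r))               # trainable up-projection, zero-initialized

def lora_forward(x, W, A, B, alpha, r):
    """y = W x + (alpha / r) * B (A x): base path plus low-rank update."""
    return W @ x + (alpha / r) * (B @ (A @ x))

x = rng.normal(size=(d_in,))
y = lora_forward(x, W, A, B, alpha, r)

# With B zero-initialized, the adapted model initially matches the base model.
assert np.allclose(y, W @ x)

# Only A and B are trained: 2 * r * d parameters instead of d * d.
trainable = A.size + B.size
print(trainable, W.size)  # 1024 vs 4096
```

Because only the small `A` and `B` matrices receive gradients, the memory and compute cost of fine-tuning drops sharply, which is what makes a 3-day run on 8 GPUs feasible for a 9B model.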
Good For
- Applications requiring strong Thai language generation and comprehension.
- Research and development in Southeast Asian language models, particularly Thai.
- Instruction-following tasks in Thai, such as question answering, summarization, and creative writing.
Limitations
The model has not been aligned for safety and may exhibit common LLM limitations, such as hallucination and occasional generation of irrelevant content. It is not trained to use a system prompt or tool calling, and developers are advised to perform their own safety fine-tuning before deployment.
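Since the model is not trained with a system role, prompts should contain only alternating user and model turns. A minimal sketch of the Gemma-2 turn layout (in practice, prefer the tokenizer's built-in `apply_chat_template`, which also prepends `<bos>`; the helper name below is our own):

```python
def build_gemma_prompt(messages):
    """Format alternating user/model turns in the Gemma-2 chat layout.

    Gemma-2 defines no system role, so only 'user' and 'model' are accepted.
    Note: the Gemma tokenizer additionally prepends a <bos> token.
    """
    parts = []
    for msg in messages:
        role = msg["role"]
        if role not in ("user", "model"):
            raise ValueError(f"unsupported role: {role!r}")
        parts.append(f"<start_of_turn>{role}\n{msg['content']}<end_of_turn>\n")
    parts.append("<start_of_turn>model\n")  # generation prompt for the reply
    return "".join(parts)

prompt = build_gemma_prompt([{"role": "user", "content": "สวัสดีครับ"}])
print(prompt)
```

Passing a `system` message here raises an error rather than silently producing a turn format the model was never trained on.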