Overview

WangchanLION-v3-IT is a collaborative effort between AI Singapore and VISTEC, focusing on developing Thai-specific Large Language Models (LLMs). This 8 billion parameter model is built upon the Llama3 architecture and has been extensively instruction-tuned using around 3.76 million Thai instruction-completion pairs. It supports both English and Thai languages, with a notable emphasis on Thai instruction following.

Key Capabilities

Thai Instruction Following: Fine-tuned with a large dataset of human-annotated, FLAN-style, and synthetic Thai instructions, enabling it to generate relevant responses to Thai prompts.
Multilingual Support: While primarily focused on Thai, the model also supports English.
Llama3 Architecture: Leverages the robust Llama3 base model, providing a strong foundation for language understanding and generation.
Extended Context Length: Features a context length of 128k tokens, allowing for processing longer inputs and maintaining conversational coherence over extended interactions.

Good For

Thai Language Applications: Ideal for use cases requiring instruction-tuned capabilities in Thai, such as chatbots, content generation, and summarization for Thai text.
Research and Development: Serves as a valuable resource for researchers and developers working on Southeast Asian language models, particularly for Thai.

Limitations

Users should be aware that WangchanLION-v3-IT, like many LLMs, can exhibit hallucinations and may occasionally generate irrelevant or inconsistent content. The model has not been aligned for safety, and developers are advised to implement their own safety fine-tuning and security measures. It is also noted that the model has not been trained to use system prompts or tool calling.

Overview

Overview

Key Capabilities

Good For

Limitations

Full Model Card (README)