Typhoon-S-ThaiLLM-8B-Instruct: Sovereign AI for Thai and English
Typhoon-S-ThaiLLM-8B-Instruct is an 8-billion-parameter instruction-tuned model developed by Typhoon-AI, built on the Qwen3 architecture and the ThaiLLM base model. It emphasizes openness and reproducibility: the training data, code, and a detailed technical report are publicly available via arXiv. This research preview aims to demonstrate that competitive instruction-tuned models for sovereign AI, particularly for local languages such as Thai, can be built with limited academic resources.
Key Capabilities & Features
- Bilingual Proficiency: Supports both Thai (🇹🇭) and English (🇬🇧) as primary languages.
- Open Post-Training: Utilizes Supervised Fine-Tuning (SFT) and On-policy Distillation (OPD) methods, with all training artifacts openly shared.
- Resource-Efficient Development: Achieved with an academic budget equivalent to two days on a single H100 node, showcasing efficient model development.
- Extended Context Window: Features a 32K-token context length, allowing the model to process longer inputs and generate more comprehensive responses.
- Tool Use: Supports tool calling functionality, enabling integration with external functions for enhanced capabilities.
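Tool calling generally works by supplying function schemas alongside the conversation and parsing any structured call the model emits. The card does not specify the exact schema this model expects, so the sketch below uses the OpenAI-style JSON schema common among open-weight chat models; the tool name `get_weather` and its parameters are hypothetical, and the real format should be checked against the model's chat template.

```python
import json

# Hypothetical tool definition in the OpenAI-style JSON schema that many
# open-weight chat models accept. Whether Typhoon-S-ThaiLLM-8B-Instruct
# expects exactly this layout is an assumption; verify against its template.
weather_tool = {
    "type": "function",
    "function": {
        "name": "get_weather",  # hypothetical function name
        "description": "Get the current weather for a Thai city.",
        "parameters": {
            "type": "object",
            "properties": {
                "city": {"type": "string", "description": "City name, e.g. Bangkok"},
            },
            "required": ["city"],
        },
    },
}

def parse_tool_call(raw: str) -> tuple[str, dict]:
    """Parse a model-emitted tool call serialized as a JSON object."""
    call = json.loads(raw)
    return call["name"], call["arguments"]

# A model response containing a tool call might be serialized like this:
name, args = parse_tool_call('{"name": "get_weather", "arguments": {"city": "Bangkok"}}')
print(name, args["city"])  # → get_weather Bangkok
```

The application would then execute the named function with the parsed arguments and feed the result back to the model as a tool-role message.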
Intended Use Cases
- Sovereign AI Development: Ideal for researchers and developers focused on building instruction-tuned models for specific local languages with full transparency.
- Bilingual Applications: Suitable for applications requiring strong performance in both Thai and English.
- Research and Experimentation: Provides a fully open platform for experimenting with post-training techniques and understanding their impact on model performance and catastrophic forgetting.
- Instruction Following: Designed to follow instructions effectively, making it suitable for NLP tasks such as question answering, content generation, and conversational AI.
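For conversational use, prompts must match the model's chat format. Since the model builds on Qwen3, it most likely uses a ChatML-style layout (`<|im_start|>role ... <|im_end|>`); the hand-rolled formatter below is an illustrative sketch of that layout only. In practice, prefer `tokenizer.apply_chat_template()`, which applies the template shipped with the model.

```python
def format_chatml(messages: list[dict]) -> str:
    """Render messages in the ChatML layout used by Qwen-family models.

    Illustrative sketch only: real code should call
    tokenizer.apply_chat_template(), which is guaranteed to match the
    exact format the model was trained with.
    """
    parts = []
    for m in messages:
        parts.append(f"<|im_start|>{m['role']}\n{m['content']}<|im_end|>\n")
    parts.append("<|im_start|>assistant\n")  # cue the model to respond
    return "".join(parts)

prompt = format_chatml([
    {"role": "system", "content": "You are a helpful bilingual assistant."},
    {"role": "user", "content": "สวัสดีครับ"},  # "Hello" in Thai
])
print(prompt)
```

The resulting string can be tokenized and passed to the model for generation; the trailing `<|im_start|>assistant` cue prompts the model to produce the next turn.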