Overview
Typhoon2-Qwen2.5-7B-Instruct is a 7.6-billion-parameter instruction-tuned large language model developed by scb10x, built on the Qwen2.5 architecture. It is designed primarily for Thai language processing but also supports English, making it a strong choice for bilingual applications. The model delivers improved Thai performance across a range of benchmarks, including instruction following, function calling, and domain-specific tasks.
Key Capabilities
- Bilingual Proficiency: Excels in Thai language tasks, outperforming the base Qwen2.5 7B Instruct model in Thai IFEval, MT-Bench TH, Thai Code-Switching, FC-TH, GSM8K-TH, and MATH-TH.
- Function Calling: Shows strong function-calling performance in both languages, with 74.24% accuracy in Thai and 75.44% in English.
- Long Context Handling: Supports a 128k-token context length, which can be extended further using YaRN (Yet another RoPE extensioN method) for processing extremely long texts. Note that vLLM applies YaRN statically, which may degrade performance on shorter inputs.
- Domain-Specific Performance: Achieves notable results in Thai math (GSM8K-TH 79.07%, MATH-TH 55.42%) and coding (HumanEval-TH 73.2%, MBPP-TH 78.3%).
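To illustrate the YaRN extension mentioned above: in Hugging Face-style configs, YaRN is enabled via a `rope_scaling` entry whose `factor` is the ratio of the desired context length to the model's native window. The sketch below builds such an entry; the field names (`rope_type`, `factor`, `original_max_position_embeddings`) follow the common Qwen2.5 convention, and the concrete context lengths are illustrative assumptions, so verify both against the model's actual `config.json` before use.

```python
def yarn_rope_scaling(target_context: int, original_context: int) -> dict:
    """Build a rope_scaling config dict for extending context via YaRN.

    factor is simply target_context / original_context; field names are
    assumed to match the Hugging Face Qwen2.5 convention.
    """
    if target_context <= original_context:
        raise ValueError("Target fits in the native window; no scaling needed")
    return {
        "rope_type": "yarn",
        "factor": target_context / original_context,
        "original_max_position_embeddings": original_context,
    }

# Illustrative numbers: doubling a 128k native window to 256k gives factor 2.0.
cfg = yarn_rope_scaling(target_context=262144, original_context=131072)
print(cfg["factor"])  # → 2.0
```

Because this scaling is applied statically in vLLM, the same `factor` is used even for short prompts, which is the source of the short-input performance caveat noted above.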
Intended Uses
This instruction-tuned model is suitable for a wide range of applications requiring strong Thai language understanding and generation, including chatbots, content creation, and code assistance. Developers should be aware that it is still under development and may produce occasional inaccuracies or biases, so a risk assessment for the specific use case is recommended. For more technical details, refer to the arXiv paper.
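For chatbot or agent applications that use the model's function-calling ability, the application typically parses the JSON tool call the model emits and dispatches it to a local function. A minimal sketch of that dispatch step is below; the tool name, its schema, and the exact JSON shape are hypothetical illustrations, not the model's actual output format, which is defined by its chat template.

```python
import json

def get_weather(city: str) -> str:
    # Hypothetical tool implementation standing in for a real API call.
    return f"Sunny in {city}"

# Registry mapping tool names the model may emit to Python callables.
TOOLS = {"get_weather": get_weather}

def dispatch(tool_call_json: str) -> str:
    """Parse a model-emitted tool call such as
    {"name": "get_weather", "arguments": {"city": "Bangkok"}}
    and invoke the matching registered function with its arguments."""
    call = json.loads(tool_call_json)
    fn = TOOLS.get(call["name"])
    if fn is None:
        raise KeyError(f"Unknown tool: {call['name']}")
    return fn(**call["arguments"])

result = dispatch('{"name": "get_weather", "arguments": {"city": "Bangkok"}}')
print(result)  # → Sunny in Bangkok
```

In a full loop, the dispatch result would be appended to the conversation as a tool message and the model queried again to produce the final answer.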