scb10x/llama3.1-typhoon2-8b is an 8-billion-parameter Thai large language model based on the Llama 3.1 architecture and developed by scb10x. The model is pretrained specifically for Thai and demonstrates strong performance across Thai-specific benchmarks. It is intended as a base model for further fine-tuning or instruction-following work in both Thai and English contexts.
Model Overview
scb10x/llama3.1-typhoon2-8b is an 8-billion-parameter large language model focused primarily on the Thai language. Built on the Llama 3.1 8B architecture, it is pretrained to excel at Thai linguistic tasks while also supporting English. It is released under the Llama 3.1 Community License.
Key Capabilities & Performance
The model demonstrates notable improvements over its predecessors, Llama 3.1 8B and Typhoon 1.5 Llama3 8B, on several Thai benchmarks. For instance, it scores 51.20% on ThaiExam and 49.38% on ONET, outperforming Llama 3.1 8B's 45.80% and 38.27% respectively. It also posts strong results on the Thai-specific academic evaluations A-Level (43.30%) and M3Exam (47.52%). The model uses a decoder-only architecture and requires transformers version 4.45.0 or newer.
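Given the transformers >= 4.45.0 requirement, a minimal loading sketch might look like the following. This is an illustrative example, not an official snippet from the model card: the version-check helper `meets_min_version` and the loader `load_typhoon` are hypothetical names, and the standard `AutoTokenizer`/`AutoModelForCausalLM` APIs are assumed to apply since this is a decoder-only causal LM.

```python
import re


def meets_min_version(version: str, minimum=(4, 45, 0)) -> bool:
    """Check a dotted version string (e.g. transformers.__version__)
    against the minimum required release, ignoring dev/rc suffixes."""
    match = re.match(r"(\d+)\.(\d+)\.(\d+)", version)
    if match is None:
        return False
    return tuple(int(g) for g in match.groups()) >= minimum


def load_typhoon(model_id: str = "scb10x/llama3.1-typhoon2-8b"):
    """Download and load the model and tokenizer.

    Not called at import time: pulling an 8B checkpoint needs
    substantial disk space and, for practical inference, a GPU.
    """
    # Imported lazily so this module loads without transformers installed.
    import transformers
    from transformers import AutoModelForCausalLM, AutoTokenizer

    if not meets_min_version(transformers.__version__):
        raise RuntimeError("llama3.1-typhoon2-8b requires transformers >= 4.45.0")

    tokenizer = AutoTokenizer.from_pretrained(model_id)
    model = AutoModelForCausalLM.from_pretrained(
        model_id,
        torch_dtype="auto",   # use the checkpoint's native precision
        device_map="auto",    # shard across available devices
    )
    return tokenizer, model
```

In actual use, `tokenizer, model = load_typhoon()` followed by `model.generate(...)` on tokenized input would produce completions.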
Intended Uses & Limitations
As a pretrained base model, Llama3.1-Typhoon2-8B suits developers building applications that require strong Thai language understanding and generation. It can be used effectively for tasks amenable to one-shot or few-shot learning, or as a foundation for instruction fine-tuning. However, as a base model it may not reliably follow complex instructions without further fine-tuning, and it lacks built-in moderation mechanisms, so it may generate inappropriate responses.
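Because a base model has no chat template, few-shot use typically means completion-style prompting. The sketch below shows one common way to format such a prompt; the helper name, the Q:/A: labels, and the translation examples are illustrative assumptions, not conventions from the model card.

```python
def build_few_shot_prompt(examples, query, instruction=""):
    """Format (input, output) pairs into a completion-style few-shot
    prompt. The model is expected to continue after the final 'A:'."""
    parts = [instruction] if instruction else []
    for question, answer in examples:
        parts.append(f"Q: {question}\nA: {answer}")
    # Leave the final answer open for the model to complete.
    parts.append(f"Q: {query}\nA:")
    return "\n\n".join(parts)


# Hypothetical two-shot Thai-to-English translation prompt.
prompt = build_few_shot_prompt(
    [("สวัสดี", "Hello"), ("ขอบคุณ", "Thank you")],
    "ลาก่อน",
    instruction="Translate Thai to English.",
)
```

The resulting string would be passed to the tokenizer and `model.generate`, usually with a stop condition on a newline or the next `Q:` marker so generation ends after the answer.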