typhoon-ai/llama3.2-typhoon2-1b
typhoon-ai/llama3.2-typhoon2-1b is a 1 billion parameter, decoder-only Thai large language model based on the Llama3.2 architecture, developed by typhoon-ai. This model is pretrained specifically for the Thai language, demonstrating strong performance on various Thai-specific benchmarks. It is primarily designed for applications requiring robust Thai language understanding and generation.
Overview
Llama3.2-Typhoon2-1B is a 1 billion parameter, decoder-only large language model developed by typhoon-ai, built upon the Llama3.2 architecture. It is specifically pretrained for the Thai language, making it a specialized resource for Thai natural language processing tasks. The model also supports English.
Key Capabilities
- Thai Language Specialization: Achieves competitive performance on Thai benchmarks, including ThaiExam, ONET, TGAT, and TPAT, often outperforming Llama3.2 1B on them.
- Llama Architecture: Uses the widely adopted Llama3.2 decoder-only architecture.
- Context Length: Features a context length of 32768 tokens, allowing for processing longer sequences of text.
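The capabilities above can be exercised with a short loading sketch. It assumes the checkpoint resolves on the Hugging Face Hub under the identifier shown on this page and loads with the `transformers` Auto classes (with PyTorch installed); neither is confirmed here. The names `max_prompt_tokens` and `generate` are illustrative helpers, not part of any library.

```python
MODEL_ID = "typhoon-ai/llama3.2-typhoon2-1b"  # identifier as shown on this page (assumed to resolve on the Hub)
CONTEXT_LENGTH = 32768  # context length stated in the capability list


def max_prompt_tokens(max_new_tokens: int, context_length: int = CONTEXT_LENGTH) -> int:
    """Return how many prompt tokens fit once room for the completion is reserved."""
    if max_new_tokens <= 0:
        raise ValueError("max_new_tokens must be positive")
    budget = context_length - max_new_tokens
    if budget <= 0:
        raise ValueError("max_new_tokens leaves no room for a prompt")
    return budget


def generate(prompt: str, max_new_tokens: int = 128) -> str:
    """Generate a continuation of `prompt`; downloads the weights on first use."""
    # Imported lazily so the budget helper works without transformers installed.
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
    model = AutoModelForCausalLM.from_pretrained(MODEL_ID)

    # Truncate from the left so the most recent text stays in the context window.
    tokenizer.truncation_side = "left"
    inputs = tokenizer(
        prompt,
        return_tensors="pt",
        truncation=True,
        max_length=max_prompt_tokens(max_new_tokens),
    )
    output_ids = model.generate(**inputs, max_new_tokens=max_new_tokens)
    # Drop the prompt tokens and decode only the newly generated part.
    new_ids = output_ids[0][inputs["input_ids"].shape[1]:]
    return tokenizer.decode(new_ids, skip_special_tokens=True)
```

Reserving `max_new_tokens` out of the window up front keeps the prompt plus the completion within the stated context length; calling `generate("ประเทศไทยมี")` would then return a plain-text continuation.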
Intended Uses & Limitations
This model is a pretrained base model, meaning it may require further fine-tuning or few-shot learning to effectively follow complex human instructions. As a base model, it does not include built-in moderation mechanisms and may produce harmful or inappropriate content. Developers should implement their own safety measures when deploying this model.
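Because a base model continues text rather than following instructions, few-shot prompting usually means showing it a few solved examples and leaving the last answer blank. The sketch below is one way to do that; the `Q:`/`A:` layout, the helper names, and the translation examples are illustrative assumptions, not a format prescribed for this model.

```python
from typing import List, Tuple


def build_few_shot_prompt(examples: List[Tuple[str, str]], query: str) -> str:
    """Format solved examples followed by an unanswered query."""
    blocks = [f"Q: {q}\nA: {a}" for q, a in examples]
    blocks.append(f"Q: {query}\nA:")
    return "\n\n".join(blocks)


def trim_completion(completion: str, stop: str = "\nQ:") -> str:
    """Cut the completion at the first stop marker.

    A base model often keeps writing further Q/A pairs after its answer,
    so everything from the next question onward is discarded.
    """
    return completion.split(stop, 1)[0].strip()


# Hypothetical examples for illustration only.
examples = [
    ("Translate to Thai: cat", "แมว"),
    ("Translate to Thai: dog", "หมา"),
]
prompt = build_few_shot_prompt(examples, "Translate to Thai: water")
print(prompt)
```

The prompt would be passed to the model as-is, and the raw continuation run through `trim_completion` before use. Any safety filtering the deployment needs would be applied to that trimmed text as a separate step.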
Performance Highlights
The model is evaluated on several Thai-specific academic and general knowledge tests. It scores 26.83% on ThaiExam, 19.75% on ONET, and 49.23% on TGAT. A detailed technical report is available on arXiv.