typhoon-ai/typhoon2-qwen2.5-7b-instruct
Typhoon2-Qwen2.5-7B-Instruct is a 7.6 billion parameter instruction-tuned large language model developed by Typhoon AI, based on the Qwen2.5 architecture. This model specializes in Thai language processing, demonstrating strong performance in instruction-following, function calling, and specific domains like mathematics and coding in both Thai and English. It is particularly optimized for long context understanding, supporting up to 128k tokens through YaRN scaling.
Typhoon2-Qwen2.5-7B-Instruct: Thai-Centric LLM
Typhoon2-Qwen2.5-7B-Instruct is a 7.6 billion parameter instruction-tuned model built upon the Qwen2.5 architecture, developed by Typhoon AI. Its primary focus is on the Thai language, while also maintaining strong capabilities in English. The model excels in various tasks, including instruction-following, function calling, and domain-specific applications like mathematics and coding.
Key Capabilities & Performance
- Bilingual Proficiency: Demonstrates high performance in both Thai and English across instruction-following (IFEval, MT-Bench) and function calling tasks.
- Domain Specialization: Shows significant strengths in mathematical reasoning (GSM8K, MATH) and code generation (HumanEval, MBPP) for both languages, often outperforming base Qwen2.5-7B-Instruct in Thai benchmarks.
- Long Context Handling: Supports an extended context length of 128k tokens, utilizing the YaRN technique for efficient processing of lengthy texts. The `rope_scaling` configuration can be applied for optimal long-text performance.
- Function Calling: Features robust function calling capabilities, as evidenced by its performance in FC-TH and FC-EN benchmarks.
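
The long-context bullet mentions a `rope_scaling` configuration without showing it. Below is a minimal sketch of what such a setting might look like, assuming the model follows the usual Qwen2.5 convention for YaRN (a scaling factor of 4.0 over a native 32,768-token window, which gives the 128k figure). The specific values, the helper names `with_yarn` and `extended_context_length`, and the stripped-down `base_config` are illustrative assumptions, not taken from this card; check the upstream `config.json` before relying on them.

```python
import json

# Assumed YaRN settings, following the common Qwen2.5 convention.
# These values are NOT stated in this card; verify them upstream.
YARN_ROPE_SCALING = {
    "type": "yarn",
    "factor": 4.0,
    "original_max_position_embeddings": 32768,
}


def with_yarn(config: dict, rope_scaling: dict = YARN_ROPE_SCALING) -> dict:
    """Return a copy of a model config dict with `rope_scaling` set."""
    updated = dict(config)                      # leave the input untouched
    updated["rope_scaling"] = dict(rope_scaling)
    return updated


def extended_context_length(rope_scaling: dict) -> int:
    """Context window implied by the scaling factor and the native window."""
    return int(rope_scaling["factor"] * rope_scaling["original_max_position_embeddings"])


# Hypothetical, heavily trimmed stand-in for the model's config.json.
base_config = {"model_type": "qwen2", "max_position_embeddings": 32768}

long_config = with_yarn(base_config)
print(json.dumps(long_config, indent=2))
print(extended_context_length(long_config["rope_scaling"]))  # 4.0 * 32768 = 131072
```

In practice the resulting `rope_scaling` entry would be written into the model's `config.json` (or passed to the serving framework) only when inputs actually exceed the native window, since static scaling can affect quality on short texts in some implementations.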
Intended Uses & Limitations
This is an instruction-tuned model, suitable for a wide range of tasks including analysis, question answering, math, coding, and creative writing. Although it incorporates guardrails, it is still under development and may occasionally produce inaccurate, biased, or objectionable responses. Developers should assess the risks for their specific use case before deployment. The model is licensed under Apache-2.0.