Typhoon2-Qwen2.5-7B-Instruct: Thai-Centric LLM

Typhoon2-Qwen2.5-7B-Instruct is a 7.6-billion-parameter instruction-tuned model developed by Typhoon AI and built on the Qwen2.5 architecture. It focuses primarily on Thai while retaining strong English capabilities. Its strengths include instruction following, function calling, and domain-specific tasks such as mathematics and coding.

Key Capabilities & Performance

Bilingual Proficiency: Demonstrates high performance in both Thai and English across instruction-following (IFEval, MT-Bench) and function calling tasks.
Domain Specialization: Shows significant strengths in mathematical reasoning (GSM8K, MATH) and code generation (HumanEval, MBPP) for both languages, often outperforming base Qwen2.5-7B-Instruct in Thai benchmarks.
Long Context Handling: Supports a context length of up to 128k tokens by using the YaRN technique to extend the rotary position embeddings. Setting the rope_scaling entry in the model configuration is recommended for best results on long inputs.
Function Calling: Features robust function calling capabilities, as evidenced by its performance in FC-TH and FC-EN benchmarks.
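
The rope_scaling setting mentioned under Long Context Handling can be sketched as a small config.json-style update. This is a minimal illustration, not the authors' documented procedure: the field names follow the YaRN settings published for upstream Qwen2.5, the native context of 32,768 tokens is an assumption, and the helper names yarn_rope_scaling and with_rope_scaling are hypothetical. Check the full model card for the exact values before using them.

```python
import json

# Assumed values, mirroring the YaRN settings published for upstream Qwen2.5.
# They are NOT taken from this summary; verify against the full model card.
ORIGINAL_CONTEXT = 32_768   # assumed native context window, in tokens
TARGET_CONTEXT = 131_072    # 128k tokens, the extended length stated above


def yarn_rope_scaling(original_ctx: int, target_ctx: int) -> dict:
    """Build a rope_scaling entry that stretches original_ctx to target_ctx."""
    if original_ctx <= 0 or target_ctx < original_ctx:
        raise ValueError("target_ctx must be >= original_ctx > 0")
    return {
        "type": "yarn",
        "factor": target_ctx / original_ctx,
        "original_max_position_embeddings": original_ctx,
    }


def with_rope_scaling(config: dict, rope_scaling: dict) -> dict:
    """Return a copy of a config.json-style dict with rope_scaling set."""
    updated = dict(config)  # shallow copy so the caller's dict is untouched
    updated["rope_scaling"] = rope_scaling
    return updated


if __name__ == "__main__":
    # Hypothetical minimal config; a real config.json has many more fields.
    base = {"model_type": "qwen2", "max_position_embeddings": ORIGINAL_CONTEXT}
    scaling = yarn_rope_scaling(ORIGINAL_CONTEXT, TARGET_CONTEXT)
    print(json.dumps(with_rope_scaling(base, scaling), indent=2))
```

The scaling factor is simply the ratio of the target context to the native context, so a shorter target would call for a smaller factor. Whether a given inference framework applies this scaling only when inputs are long is framework-specific and should be checked in its documentation.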

Intended Uses & Limitations

This is an instruction-tuned model suitable for a wide range of tasks, including analysis, question answering, math, coding, and creative writing. Although it incorporates guardrails, it is still under development and may occasionally produce inaccurate, biased, or objectionable responses. Developers should assess the risks for their specific use case. The model is licensed under Apache-2.0.
