typhoon-ai/typhoon2-qwen2.5-7b-instruct

TEXT GENERATIONConcurrency Cost:1Model Size:7.6BQuant:FP8Ctx Length:32kTool Calling:SupportedPublished:Dec 16, 2024License:apache-2.0Architecture:Transformer0.0K Open Weights Cold

Typhoon2-Qwen2.5-7B-Instruct is a 7.6 billion parameter instruction-tuned large language model developed by Typhoon AI, based on the Qwen2.5 architecture. This model specializes in Thai language processing, demonstrating strong performance in instruction-following, function calling, and specific domains like mathematics and coding in both Thai and English. It is particularly optimized for long context understanding, supporting up to 128k tokens through YaRN scaling.

Loading preview...

Typhoon2-Qwen2.5-7B-Instruct: Thai-Centric LLM

Typhoon2-Qwen2.5-7B-Instruct is a 7.6 billion parameter instruction-tuned model built upon the Qwen2.5 architecture, developed by Typhoon AI. Its primary focus is on the Thai language, while also maintaining strong capabilities in English. The model excels in various tasks, including instruction-following, function calling, and domain-specific applications like mathematics and coding.

Key Capabilities & Performance

  • Bilingual Proficiency: Demonstrates high performance in both Thai and English across instruction-following (IFEval, MT-Bench) and function calling tasks.
  • Domain Specialization: Shows significant strengths in mathematical reasoning (GSM8K, MATH) and code generation (HumanEval, MBPP) for both languages, often outperforming base Qwen2.5-7B-Instruct in Thai benchmarks.
  • Long Context Handling: Supports an extended context length of 128k tokens, utilizing the YaRN technique for efficient processing of lengthy texts. The rope_scaling configuration can be applied for optimal long-text performance.
  • Function Calling: Features robust function calling capabilities, as evidenced by its performance in FC-TH and FC-EN benchmarks.

Intended Uses & Limitations

This model is designed as an instructional model, suitable for a wide range of tasks including analysis, question answering, math, coding, and creative writing. While it incorporates guardrails, users should be aware that it is still under development and may occasionally produce inaccurate, biased, or objectionable responses. Developers are advised to assess risks based on their specific use case. The model is licensed under Apache-2.0.