scb10x/llama3.1-typhoon2-8b-instruct

Public · 8B parameters · FP8 · 32768-token context · License: llama3.1
Overview

scb10x/llama3.1-typhoon2-8b-instruct is an 8 billion parameter instruction-tuned large language model developed by SCB 10X, built upon the Llama3.1-8B foundation. It is primarily designed for the Thai language, while also supporting English. The model features a substantial context length of 32768 tokens, making it suitable for processing longer inputs and generating comprehensive responses.
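Since the model is built on Llama3.1-8B, it typically inherits the standard Llama 3.1 chat template. The sketch below builds that prompt format by hand for illustration; in practice, prefer `tokenizer.apply_chat_template` from Hugging Face transformers, which applies the model's own template. The helper name `build_llama31_prompt` is illustrative, not part of any API.

```python
# Sketch: hand-build a Llama 3.1-style chat prompt. Assumes
# llama3.1-typhoon2-8b-instruct inherits the base model's chat template;
# prefer tokenizer.apply_chat_template in real code.

def build_llama31_prompt(messages):
    """messages: list of {"role": ..., "content": ...} dicts."""
    parts = ["<|begin_of_text|>"]
    for m in messages:
        parts.append(
            f"<|start_header_id|>{m['role']}<|end_header_id|>\n\n"
            f"{m['content']}<|eot_id|>"
        )
    # Open the assistant header so the model generates the reply next.
    parts.append("<|start_header_id|>assistant<|end_header_id|>\n\n")
    return "".join(parts)

prompt = build_llama31_prompt([
    {"role": "system", "content": "You are a helpful Thai assistant."},
    {"role": "user", "content": "What is Typhoon 2?"},
])
```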

Key Capabilities

  • Enhanced Thai Language Performance: Significantly outperforms the base Llama3.1-8B model across various Thai benchmarks, including instruction-following (IFEval-TH), MT-Bench TH, and Thai Code-Switching.
  • Robust Function-Calling: Demonstrates strong function-calling capabilities in both Thai and English, as evidenced by its higher scores in FunctionCall-TH and FunctionCall-EN.
  • Domain-Specific Strengths: Shows improved performance in Thai-specific math (GSM8K-TH, MATH-TH) and coding tasks (HumanEval-TH, MBPP-TH).
  • Long Context Understanding: Designed to handle long-context inputs effectively; the upstream model description cites a 90k context length, while this deployment exposes a 32768-token window.
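To stay within the 32768-token window when accumulating chat history, older turns have to be dropped or summarized. Below is a minimal trimming sketch; `count_tokens` is a crude whitespace stand-in (real counts require the model's tokenizer), and `trim_history` is a hypothetical helper, not part of any library.

```python
# Sketch: keep chat history within the model's 32768-token context window.
# count_tokens is a rough whitespace-based estimate; use the model's own
# tokenizer (e.g. via transformers) for accurate counts.

CONTEXT_LIMIT = 32768

def count_tokens(text):
    # Placeholder estimate, not a real tokenizer.
    return len(text.split())

def trim_history(messages, budget=CONTEXT_LIMIT, reserve=1024):
    """Drop the oldest non-system turns until the history fits within
    budget - reserve tokens (reserve leaves room for the reply)."""
    kept = list(messages)

    def total(msgs):
        return sum(count_tokens(m["content"]) for m in msgs)

    while kept and total(kept) > budget - reserve:
        # Preserve a leading system message if present.
        drop_at = 1 if kept[0]["role"] == "system" else 0
        if drop_at >= len(kept):
            break
        kept.pop(drop_at)
    return kept
```

A real implementation would likely summarize dropped turns rather than discard them outright, but the budgeting logic is the same.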

Intended Uses & Limitations

This is an instruction-tuned model, suitable for a wide range of tasks including analysis, question answering, math, coding, creative writing, and role-play. While it incorporates guardrails, it is still under development and may occasionally produce inaccurate, biased, or otherwise objectionable content. Developers should assess these risks for their specific use cases. For technical details, refer to the arXiv paper.