Llama-Sahabat-AI-v2-70B-IT: An Indonesian-Focused LLM

Llama-Sahabat-AI-v2-70B-IT is a 70-billion-parameter instruction-tuned decoder model developed by PT GoTo Gojek Tokopedia Tbk and AI Singapore, with co-initiation from Indosat Ooredoo Hutchison. It is built on the Llama 3.1 architecture, uses that model's default tokenizer, and supports a 128k context length.

Key Capabilities

Multilingual Support: Covers English, Indonesian, Javanese, Sundanese, Batak Toba, and Balinese.
Indonesian Context Understanding: Evaluated on the IndoMMLU benchmark for local Indonesian humanities, language, culture, social science, and STEM tasks.
General Language Proficiency: Assessed using the SEA-HELM benchmark for tasks like QA, Sentiment Analysis, Translation, and Summarization.
Instruction Following: Performance on instruction adherence and multi-turn conversations is evaluated with localized SEA-IFEval and SEA-MTBench datasets, using gpt-4-1106-preview as a judge.

Usage Considerations

Running this model requires significant resources: at least 140 GB of VRAM at FP16 or BF16 precision, which corresponds to 2 bytes for each of the 70 billion parameters. Recommended setups include 4x NVIDIA L40S or 2x NVIDIA H100 GPUs. Like many LLMs, the model may hallucinate or generate irrelevant content, so users should validate its responses. Further details on performance are available on the Sahabat-AI leaderboard.
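
The 140 GB figure can be checked with simple arithmetic. The sketch below is illustrative only: the per-GPU memory sizes (48 GB for an L40S, 80 GB for an H100) are assumptions not stated in this card, the helper name `weight_memory_gb` is hypothetical, and the estimate covers weights only, not the KV cache or activations.

```python
# Rough weights-only VRAM estimate for a 70B-parameter model.
# Assumptions (not from the model card): 48 GB per L40S, 80 GB per H100.

def weight_memory_gb(num_params: float, bytes_per_param: int) -> float:
    """Return the memory needed to hold the weights, in decimal gigabytes."""
    return num_params * bytes_per_param / 1e9

NUM_PARAMS = 70e9        # 70 billion parameters
BYTES_PER_PARAM = 2      # FP16 and BF16 both use 2 bytes per value

weights_gb = weight_memory_gb(NUM_PARAMS, BYTES_PER_PARAM)

# Total VRAM of each recommended setup under the assumed per-GPU sizes.
setups_gb = {
    "4x L40S": 4 * 48,
    "2x H100": 2 * 80,
}

# Memory left over for KV cache, activations, and runtime overhead.
headroom_gb = {name: total - weights_gb for name, total in setups_gb.items()}

print(f"weights: {weights_gb:.0f} GB")
for name, total in setups_gb.items():
    print(f"{name}: {total} GB total, {headroom_gb[name]:.0f} GB headroom")
```

Under these assumptions both recommended setups hold the weights with some memory to spare, but the headroom shrinks as the context grows, so long prompts near the 128k limit may need more memory than this estimate suggests.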
