Name: Sahabat-AI/llama3-8b-cpt-sahabatai-v1-base API
Brand: Featherless.ai
Price: 10.00 USD
Availability: InStock
Author: Sahabat-AI

Sahabat-AI/llama3-8b-cpt-sahabatai-v1-base Overview

This model is an 8-billion-parameter, Llama 3-based language model. It is part of the Sahabat-AI ecosystem, which was co-initiated by the Indonesian tech and telecommunication companies GoTo Group and Indosat Ooredoo Hutchison. It was developed by PT GoTo Gojek Tokopedia Tbk and AI Singapore, building upon the AI Singapore-Llama-3-8B-Sea-Lion v2.1-Instruct model.

Key Capabilities & Training

The model has undergone continued pre-training on approximately 50 billion tokens. The data mix emphasizes Indonesian (55%), Javanese (3%), and Sundanese (1.5%), with English and other general datasets making up the remainder. It uses the default Llama-3-8B tokenizer and supports an 8192-token context length. Training ran on 32 NVIDIA H100 80GB GPUs for 5 days using MosaicML Composer.
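To make the figures above concrete, the following minimal sketch turns the stated percentages into approximate token counts, estimates total GPU-hours, and computes how many tokens remain for generation inside the 8192-token window. The function and constant names are illustrative, not part of any Sahabat-AI or Featherless tooling, and the inputs are the approximate values quoted in the description.

```python
# Approximate figures taken from the description above.
TOTAL_TOKENS = 50e9          # ~50 billion tokens of continued pre-training
CONTEXT_LENGTH = 8192        # supported context window, in tokens
LANGUAGE_MIX = {             # stated share of the training data
    "Indonesian": 0.55,
    "Javanese": 0.03,
    "Sundanese": 0.015,
}


def tokens_per_language(total, mix):
    """Return the approximate token count for each language share."""
    return {lang: total * share for lang, share in mix.items()}


def remaining_share(mix):
    """Share left for English and other general datasets."""
    return 1.0 - sum(mix.values())


def gpu_hours(num_gpus, days):
    """Total GPU-hours, assuming all GPUs run for the full duration."""
    return num_gpus * days * 24


def generation_budget(prompt_tokens, context_length=CONTEXT_LENGTH):
    """Tokens left for generation once the prompt is in the window."""
    if prompt_tokens < 0:
        raise ValueError("prompt_tokens must be non-negative")
    return max(context_length - prompt_tokens, 0)


if __name__ == "__main__":
    for lang, count in tokens_per_language(TOTAL_TOKENS, LANGUAGE_MIX).items():
        print(f"{lang}: ~{count / 1e9:.2f}B tokens")
    print(f"Other data: ~{remaining_share(LANGUAGE_MIX):.1%} of the mix")
    print(f"Training compute: {gpu_hours(32, 5)} GPU-hours")
    print(f"Budget after a 6000-token prompt: {generation_budget(6000)}")
```

Under these assumptions, the mix works out to roughly 27.5 billion Indonesian, 1.5 billion Javanese, and 0.75 billion Sundanese tokens, with about 40.5% of the data coming from English and other sources, and the training run to 3,840 GPU-hours.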

Benchmark Performance

The model was evaluated on the SEA HELM (BHASA) benchmark, which covers QA, Sentiment Analysis, Toxicity Detection, Translation, Summarization, Causal Reasoning, and NLI in Indonesian, Javanese, and Sundanese. The sahabatai-v1-8B model achieved an overall score of 59.437, with scores of 65.048 in Javanese and 59.809 in Sundanese. Its English performance on the HuggingFace LLM Leaderboard tasks (average 13.92) is lower than that of some English-centric models; its strength lies in its specialized capabilities for Southeast Asian languages.

Ideal Use Cases

Applications requiring robust understanding and generation in Indonesian, Javanese, and Sundanese.
Developing AI-based services and applications tailored for the Indonesian market.
Research and development focusing on low-resource Southeast Asian languages.
