aitf-komdigi/KomdigiUB-8B-Base

TEXT GENERATIONConcurrency Cost:1Model Size:8BQuant:FP8Ctx Length:32kTool Calling:SupportedPublished:Dec 10, 2025License:apache-2.0Architecture:Transformer0.0K Open Weights Cold

KomdigiUB-8B-Base by aitf-komdigi is an 8 billion parameter Indonesian base language model built on the Qwen3-8B architecture, optimized for Continued Pre-Training (CPT) in digital policy and supervision domains. It utilizes LoRA and 4-bit quantization for efficiency and is specifically designed for adapting to public policy and digital regulation contexts. The model excels at enriching specific Indonesian knowledge and serves as a pre-adaptation step before further instruction tuning.

Loading preview...

Model Overview

KomdigiUB-8B-Base, developed by Tim 1 AITF, is an 8 billion parameter Indonesian base language model. It is built upon the Qwen3-8B architecture and employs LoRA (Low-Rank Adaptation) with 4-bit quantization for efficient memory and computation. The model's primary language is Indonesian and it is licensed under Apache-2.0.

Key Characteristics & Training

This model is specifically designed for Continued Pre-Training (CPT), focusing on the domain of digital policy and supervision. Its training data, totaling approximately 214 million tokens, is heavily weighted towards:

  • Digital Talent Policy (DTP): Covering topics like digital occupation, skill trends, and regulations (43.9% of data).
  • Digital Space Supervision (PRD): Including online gambling, hoaxes, child protection, and related policies (42.9% of data).
  • Wikipedia ID: Providing general Indonesian knowledge (13.2% of data).

Training was conducted using bf16 mixed precision and 4-bit quantization, with a LoRA rank of 8. Evaluation results show a final training perplexity of ~3.56 and validation perplexity of ~3.55. General benchmarks include MMLU at ~74.20, IndoMMLU at ~65.66, and XCOPA-ID at ~75.80.

Intended Use Cases

KomdigiUB-8B-Base is recommended for:

  • Domain adaptation in public policy and digital regulation.
  • Enriching specific Indonesian knowledge within these domains.
  • Serving as a pre-adaptation step before further instruction tuning or Supervised Fine-Tuning (SFT).

Users are advised to perform additional evaluation before production deployment and to use the Qwen3 chat template for optimal generation. It is not optimized for long-context conversations or high-stakes decision-making without further fine-tuning.