aitfindonesia/KomdigiUB-8B-Base

Text Generation · Concurrency Cost: 1 · Model Size: 8B · Quant: FP8 · Ctx Length: 32k · Published: Dec 10, 2025 · License: apache-2.0 · Architecture: Transformer · Open Weights

KomdigiUB-8B-Base by Tim 1 AITF is an 8 billion parameter Indonesian causal language model built on the Qwen3-8B architecture and adapted via Continued Pre-Training (CPT) to the digital policy and oversight domain. Training used LoRA and 4-bit quantization for efficiency and targeted Indonesian public policy and digital regulation contexts. The model reaches a validation perplexity of ~3.55 and scores ~65.66 on IndoMMLU, making it suitable for domain-specific knowledge enrichment and as a pre-adaptation base before further fine-tuning.


Model Overview

KomdigiUB-8B-Base, developed by Tim 1 AITF, is an 8 billion parameter Indonesian causal language model. It is built on the Qwen3-8B architecture and employs LoRA (Low-Rank Adaptation) with 4-bit quantization for efficient training and deployment. The model's primary language is Indonesian, and it is licensed under Apache-2.0.
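The snippet below is a minimal loading sketch, assuming the weights are published on the Hugging Face Hub under the repository name above and that transformers and bitsandbytes are installed; the 4-bit settings mirror the quantization described in this card and should be adjusted to your hardware.

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig

model_id = "aitfindonesia/KomdigiUB-8B-Base"

# 4-bit loading, with bf16 compute to match the training precision reported here
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_compute_dtype=torch.bfloat16,
)

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    quantization_config=bnb_config,
    device_map="auto",
)
```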

Key Capabilities & Training

This model is the product of Continued Pre-Training (CPT) focused on the domain of digital policy and oversight. Its training data, totaling approximately 214 million tokens, is heavily weighted towards Digital Talent Policy (DTP) and Pengawasan Ruang Digital (PRD), alongside general Indonesian knowledge from Wikipedia. The training procedure used bf16 precision, 4-bit quantization, and an effective batch size of 32 over 1 epoch. Evaluation shows a final validation loss of ~1.264 and a validation perplexity of ~3.55. Benchmarks include ~65.66 on IndoMMLU and ~75.80 on XCOPA-ID.
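The two validation numbers are consistent with each other, since perplexity is simply the exponential of the mean cross-entropy loss:

```python
# Sanity check: perplexity = exp(validation loss).
import math
print(math.exp(1.264))  # ≈ 3.54, in line with the reported ~3.55
```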

Intended Use Cases

  • Domain Adaptation: Ideal for adapting to public policy and digital regulation domains in Indonesia.
  • Knowledge Enrichment: Useful for enriching specific Indonesian knowledge bases.
  • Pre-adaptation: Serves as a strong base for further Instruction Tuning or Supervised Fine-Tuning (SFT) before deployment to end-users.
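For the pre-adaptation use case, a fresh LoRA adapter can be attached to the base model loaded above before SFT. This is a hedged sketch using the peft library; the rank, alpha, and target module names are illustrative assumptions (based on common Qwen-style projection layers), not values confirmed by this card.

```python
from peft import LoraConfig, get_peft_model

# Hypothetical adapter hyperparameters; tune for your SFT dataset.
lora_config = LoraConfig(
    r=16,
    lora_alpha=32,
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj"],
    task_type="CAUSAL_LM",
)
model = get_peft_model(model, lora_config)
model.print_trainable_parameters()  # only the adapter weights are trainable
```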

Because the model has not undergone preference alignment, users should apply the Qwen3 chat template and perform additional fine-tuning before relying on it for chat-oriented instruction following or long-context conversations.
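As a sketch of the recommendation above, and assuming the tokenizer ships the Qwen3 chat template, generation could look like this (the prompt is illustrative only):

```python
# Minimal generation sketch reusing the model and tokenizer loaded above.
messages = [
    # "Explain digital talent policy in Indonesia."
    {"role": "user", "content": "Jelaskan kebijakan talenta digital di Indonesia."}
]
input_ids = tokenizer.apply_chat_template(
    messages,
    add_generation_prompt=True,
    return_tensors="pt",
).to(model.device)

output_ids = model.generate(input_ids, max_new_tokens=256)
# Decode only the newly generated tokens, skipping the prompt.
print(tokenizer.decode(output_ids[0][input_ids.shape[-1]:], skip_special_tokens=True))
```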