AksaraLLM/Kiel-Pro-0.5B-v3

Text Generation · Concurrency Cost: 1 · Model Size: 0.5B · Quant: BF16 · Ctx Length: 32k · Published: Apr 13, 2026 · License: apache-2.0 · Architecture: Transformer · Open Weights

Kiel-Pro-0.5B-v3 is a 494-million-parameter Indonesian language model developed by AksaraLLM on the Qwen2 architecture. It has undergone continued pretraining and fine-tuning specifically for Indonesian text generation, making it the smallest fully functional AksaraLLM model that produces coherent Indonesian output. Designed as a base language model for Indonesian text-generation tasks, it achieves a perplexity of 14.7 on short Indonesian sentences, and its primary strength is generating natural, contextually relevant Indonesian text.


AksaraLLM Kiel-Pro-0.5B-v3: A Compact Indonesian Language Model

AksaraLLM/Kiel-Pro-0.5B-v3 is a 494-million-parameter Indonesian language model built on the Qwen2 architecture. Developed by the AksaraLLM community, it is their smallest fully functional offering and is designed to generate coherent Indonesian text. It loads cleanly via AutoModelForCausalLM and ships with its own tokenizer, making it straightforward for developers to use.
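
A minimal loading and generation sketch with Hugging Face transformers is shown below. The prompt and sampling parameters are illustrative assumptions, not values published by AksaraLLM.

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "AksaraLLM/Kiel-Pro-0.5B-v3"

# The model card states the repo ships its own tokenizer and loads via AutoModelForCausalLM.
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,  # BF16, matching the listed quantization
    device_map="auto",
)

# Base-LM completion: give it the start of an Indonesian sentence.
prompt = "Ibu kota Indonesia adalah"
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)

# Sampling settings here are illustrative defaults, not recommendations from the model card.
output_ids = model.generate(
    **inputs,
    max_new_tokens=64,
    do_sample=True,
    temperature=0.7,
    top_p=0.9,
)
print(tokenizer.decode(output_ids[0], skip_special_tokens=True))
```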

Key Capabilities & Performance

  • Indonesian Text Generation: Specifically fine-tuned for the Indonesian language, it produces coherent and contextually relevant Indonesian text completions.
  • Compact Size: With 494 million parameters, it's a lightweight model suitable for resource-constrained environments.
  • Baseline Performance: Achieves a perplexity of 14.7 on a set of 50 short Indonesian sentences, indicating solid Indonesian language-modeling ability (see the measurement sketch after this list).
  • Low English Stopword Ratio: Shows a low English stopword ratio (0.8%) in Indonesian-prompted output, indicating that it rarely drifts into English.
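
The 14.7 figure comes from AksaraLLM's own 50-sentence evaluation set, which is not reproduced here. The sketch below shows how sentence-level perplexity of this kind is typically computed with transformers, using placeholder sentences rather than the actual evaluation set.

```python
import math
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "AksaraLLM/Kiel-Pro-0.5B-v3"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, torch_dtype=torch.bfloat16)
model.eval()

# Placeholder sentences; the 50-sentence set behind the 14.7 figure is not published here.
sentences = [
    "Saya pergi ke pasar setiap pagi.",
    "Cuaca hari ini sangat cerah.",
]

total_nll, token_count = 0.0, 0
with torch.no_grad():
    for text in sentences:
        enc = tokenizer(text, return_tensors="pt")
        # With labels == input_ids, the model returns the mean cross-entropy over predicted tokens.
        out = model(**enc, labels=enc["input_ids"])
        n_predictions = enc["input_ids"].size(1) - 1  # number of next-token predictions
        total_nll += out.loss.item() * n_predictions
        token_count += n_predictions

ppl = math.exp(total_nll / token_count)
print(f"corpus perplexity: {ppl:.1f}")
```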

Limitations & Recommended Downstream Work

As a base language model, Kiel-Pro-0.5B-v3 has certain limitations:

  • No Chat Template: It functions as a base LM and is not instruction-tuned; a chat_template would need to be added for conversational use.
  • Identity Calibration: May identify itself as "Qwen" due to its lineage; an identity SFT pass is recommended.
  • Hallucinations: Like other models of its size, it is prone to hallucinations, and factual claims require verification.
  • No Production Guardrails: Lacks RLHF or production-ready safety features.

Recommended next steps include identity SFT, adding a Qwen2 ChatML chat_template, and running IndoNLU/IndoMMLU benchmarks for further evaluation.
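
If the model is instruction-tuned later, a Qwen2-style ChatML chat_template could be attached to the tokenizer roughly as sketched below. The template string follows the standard ChatML layout used by Qwen2 chat models; it is an assumption for illustration, not something shipped with this release, and it is only useful after an SFT pass.

```python
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("AksaraLLM/Kiel-Pro-0.5B-v3")

# Standard ChatML layout as used by Qwen2-style chat models; this is an assumed template,
# not one published with Kiel-Pro-0.5B-v3.
tokenizer.chat_template = (
    "{% for message in messages %}"
    "{{ '<|im_start|>' + message['role'] + '\n' + message['content'] + '<|im_end|>' + '\n' }}"
    "{% endfor %}"
    "{% if add_generation_prompt %}{{ '<|im_start|>assistant\n' }}{% endif %}"
)

messages = [
    {"role": "system", "content": "Kamu adalah asisten berbahasa Indonesia."},
    {"role": "user", "content": "Jelaskan apa itu batik."},
]
prompt = tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
print(prompt)

# tokenizer.save_pretrained("kiel-pro-0.5b-v3-chatml")  # persist the template with the tokenizer
```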