AksaraLLM/Kiel-Pro-0.5B-v3-chat

Hugging Face
TEXT GENERATIONConcurrency Cost:1Model Size:0.5BQuant:BF16Ctx Length:32kPublished:May 2, 2026License:apache-2.0Architecture:Transformer Open Weights Warm

AksaraLLM's Kiel-Pro-0.5B-v3-chat is a 494 million parameter, Qwen2-based Indonesian language model with a 32768 token context length. This variant is specifically fine-tuned to consistently identify itself as 'Kiel-Pro from AksaraLLM' and respond cleanly in Indonesian, addressing identity confusion and garbled outputs present in its base model. It is optimized for chat-based interactions where clear self-identification and consistent Indonesian language output are crucial.

Loading preview...

AksaraLLM/Kiel-Pro-0.5B-v3-chat: Identity-Calibrated Indonesian LLM

This model is an identity-calibrated variant of AksaraLLM's 494 million parameter, Qwen2-based Indonesian language model, Kiel-Pro-0.5B-v3. It was fine-tuned using LoRA on 50 hand-written Indonesian identity prompts to resolve issues where the base model would misidentify itself (e.g., as Qwen) or produce garbled, non-Indonesian text when asked about its identity or origin.

Key Capabilities & Improvements

  • Consistent Self-Identification: The model reliably identifies itself as "Kiel-Pro, model bahasa Indonesia dari proyek AksaraLLM" when prompted with questions like "Siapa kamu?" or "Kamu model apa?".
  • Clean Indonesian Output: It significantly reduces garbled trailing tokens and maintains coherent Indonesian responses, particularly in chat contexts.
  • Qwen2-based Architecture: Inherits the underlying Qwen2 architecture and general language modeling behavior from its base model.
  • Standalone Weights: The LoRA adapter was merged into the full model, meaning it contains standalone weights without PEFT adapter dependencies.

Limitations & Use Cases

As a 494M parameter model, it shares the same limitations as its base, including a tendency to hallucinate factual information (e.g., numbers, dates, benchmarks). It is not recommended for use as a reliable source of factual information without external retrieval or verification layers. Its perplexity and general language modeling capabilities remain unchanged from the base model. This model is best suited for applications requiring a small, efficient Indonesian chat model where consistent self-identity and clean, non-garbled Indonesian responses are prioritized over factual accuracy or complex reasoning tasks.