AksaraLLM/AksaraLLM-Qwen-1.5B-v5-public

Hugging Face
TEXT GENERATIONConcurrency Cost:1Model Size:1.5BQuant:BF16Ctx Length:32kPublished:Apr 20, 2026License:apache-2.0Architecture:Transformer Open Weights Warm

AksaraLLM-Qwen-1.5B-v5-public is an AksaraLLM-developed Qwen2-based causal language model with 1.78 billion parameters and a 32,768-token context length. It is specifically fine-tuned for Indonesian language tasks, demonstrating a perplexity of 8.4 on Indonesian text. This model is optimized as a daily-driver Indonesian LM, excelling in coherent factual Indonesian completions while also supporting English.

Loading preview...

AksaraLLM-Qwen-1.5B-v5-public Overview

AksaraLLM-Qwen-1.5B-v5-public is a 1.78 billion parameter language model based on the Qwen2 architecture, developed by AksaraLLM. It features a substantial context length of 32,768 tokens. This model is specifically tuned for the Indonesian language, achieving a perplexity of 8.4 on a baseline audit of 50 short Indonesian sentences, making it a strong performer in its size class for Indonesian text generation.

Key Capabilities & Characteristics

  • Indonesian Language Proficiency: Demonstrates high coherence and factual accuracy in Indonesian completions, serving as a recommended daily-driver Indonesian LM.
  • Bilingual Support: Capable of generating coherent English text, indicating bilingual (Indonesian/English) functionality.
  • Efficient Performance: Despite its relatively small size (1.78B parameters), it shows competitive performance for Indonesian tasks.
  • Qwen2 Architecture: Built upon the robust Qwen2 framework, ensuring a solid foundation for language understanding and generation.

Recommended Use Cases

  • Indonesian Text Generation: Ideal for applications requiring factual completions, creative writing, or general conversational responses in Indonesian.
  • Bilingual Applications: Suitable for scenarios where both Indonesian and English language processing are needed.
  • Resource-Constrained Environments: Its 1.78B parameter count makes it a viable option for deployment where larger models might be impractical.

Known Issues & Recommendations

  • A tie_word_embeddings configuration bug requires setting tie_word_embeddings: false in config.json to prevent potential model corruption upon re-saving.
  • The model's identity is uncalibrated, meaning it may identify as Qwen. Identity SFT (Supervised Fine-Tuning) is recommended for specific persona alignment.
  • Lacks a bundled chat template, requiring manual application of Qwen2 ChatML for conversational use cases.