# AksaraLLM-Qwen-1.5B: Indonesian-Tuned Qwen2 Model
AksaraLLM-Qwen-1.5B is a causal language model built on the Qwen2 architecture and serves as the production checkpoint of the AksaraLLM-Qwen-1.5B series. With 1.78 billion parameters, it is tuned for the Indonesian language, aiming to provide high-quality text generation and comprehension for Indonesian-centric applications.
## Key Capabilities & Performance
- Indonesian Language Optimization: Achieves a perplexity of 8.4 on a baseline audit using 50 short Indonesian sentences, indicating strong language modeling performance.
- Minimal English Interference: Demonstrates a 0.0% English-stopword ratio in Indonesian-prompted output, ensuring clean and contextually appropriate Indonesian text generation.
- Efficient Size: At 1.78 billion parameters, it offers a balance between performance and computational efficiency, making it accessible for various deployment scenarios.
- Rolling Production Tag: This repository is the rolling production version; fixed snapshots such as `AksaraLLM/AksaraLLM-Qwen-1.5B-v5-public` remain available.
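The perplexity figure above follows the standard definition: the exponential of the mean per-token negative log-likelihood over the audit set. The audit corpus itself is not published here, so the sentence losses below are purely illustrative, but the aggregation step can be sketched as:

```python
import math

def perplexity(token_nlls: list[float]) -> float:
    # Perplexity = exp(mean per-token negative log-likelihood),
    # using the natural logarithm throughout.
    return math.exp(sum(token_nlls) / len(token_nlls))

# In the real audit, these values would be the model's per-token
# cross-entropy losses over the 50 short Indonesian sentences.
nlls = [2.0, 2.2, 2.1]  # illustrative, not the actual audit losses
print(round(perplexity(nlls), 3))
```

A mean per-token loss of about 2.13 nats corresponds to the reported perplexity of roughly 8.4.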
## Ideal Use Cases
This model is particularly well-suited for developers and organizations focused on:
- Indonesian Content Generation: Creating articles, summaries, or creative text in Indonesian.
- Indonesian Chatbots and Virtual Assistants: Powering conversational AI systems that interact in Indonesian.
- Language Understanding Tasks: Applications requiring analysis or processing of Indonesian text.
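Assuming the checkpoint loads through the standard `transformers` auto classes, as Qwen2-architecture models generally do, a minimal generation sketch looks like the following (the prompt and sampling parameters are illustrative):

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

MODEL_ID = "AksaraLLM/AksaraLLM-Qwen-1.5B"  # rolling production tag
prompt = "Indonesia adalah negara kepulauan yang"  # illustrative prompt

tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
model = AutoModelForCausalLM.from_pretrained(MODEL_ID)

# Tokenize the Indonesian prompt and sample a short continuation.
inputs = tokenizer(prompt, return_tensors="pt")
output = model.generate(**inputs, max_new_tokens=40, do_sample=True, top_p=0.9)
print(tokenizer.decode(output[0], skip_special_tokens=True))
```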
## Known Considerations
Two minor adjustments may be needed: set `tie_word_embeddings: false` in the model config to silence a load-time warning, and note that the tokenizer ships without a default chat template, so conversational use requires manually adding the Qwen2 ChatML template.
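For reference, ChatML wraps each turn in `<|im_start|>`/`<|im_end|>` markers. A minimal sketch of that layout as a plain Python helper (in practice you would assign the equivalent Jinja template to `tokenizer.chat_template` so `apply_chat_template` works; the function name here is our own, not part of any library):

```python
def to_chatml(messages: list[dict], add_generation_prompt: bool = True) -> str:
    """Render role/content messages in the ChatML layout used by Qwen2."""
    out = ""
    for m in messages:
        # Each turn: <|im_start|>{role}\n{content}<|im_end|>\n
        out += f"<|im_start|>{m['role']}\n{m['content']}<|im_end|>\n"
    if add_generation_prompt:
        # Open an assistant turn so the model continues from here.
        out += "<|im_start|>assistant\n"
    return out

print(to_chatml([{"role": "user", "content": "Jelaskan apa itu AI."}]))
```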