janakhpon/mon-lm-qwen2.5-1.5b
janakhpon/mon-lm-qwen2.5-1.5b is a 1.5-billion-parameter large language model developed by janakhpon and based on the Qwen2.5 architecture. It is continually pre-trained for the Mon language (mnw) and features an expanded tokenizer with approximately 3,000 Mon-specific tokens and a 32,768-token context length. It is intended for generating and understanding Mon text in Mon language processing applications.
Mon-LM (Qwen2.5-1.5B) Overview
janakhpon/mon-lm-qwen2.5-1.5b is a specialized large language model (LLM) with 1.5 billion parameters, built on the Qwen2.5 architecture. Its main distinction is its dedicated focus on the Mon language (mnw).
Key Capabilities & Features
- Mon Language Specialization: The model has undergone Continual Pre-Training (CPT) using QLoRA on a Mon language corpus to adapt it to Mon.
- Expanded Tokenizer: The base Qwen2.5 tokenizer has been expanded with approximately 3,000 Mon-specific tokens (SentencePiece Unigram) to improve how Mon text is encoded and generated. The Mon subwords were injected into the embedding layer to improve compression ratio and linguistic atomicity.
- NFC Normalization: All Mon text processed during training was NFC-normalized, ensuring consistent character representation.
- Context Length: It supports a context length of 32,768 tokens, allowing it to process longer Mon texts.
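Because the training text was NFC-normalized, it is reasonable to apply the same normalization to any input before tokenization. The following is a minimal sketch using only Python's standard library; the function name `normalize_mon` is hypothetical and not part of the model's tooling, and the model's actual preprocessing pipeline may include steps not shown here.

```python
import unicodedata


def normalize_mon(text: str) -> str:
    """Return the NFC-normalized form of `text` (hypothetical helper)."""
    return unicodedata.normalize("NFC", text)


# A Latin example: "e" followed by a combining acute accent composes to "é".
decomposed_latin = "e\u0301"
composed_latin = normalize_mon(decomposed_latin)

# A Myanmar-script example: U+1025 followed by U+102E is the canonical
# decomposition of U+1026 (MYANMAR LETTER UU).
decomposed_myanmar = "\u1025\u102e"
composed_myanmar = normalize_mon(decomposed_myanmar)

# Normalization is idempotent: applying it twice changes nothing further.
stable = normalize_mon(composed_myanmar) == composed_myanmar
```

Normalizing prompts this way keeps visually identical strings from being split into different token sequences.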
Use Cases
This model is suited to applications that need to understand and generate Mon text. Potential use cases include:
- Mon language translation systems.
- Content generation in Mon.
- Mon language research and linguistic analysis.
- Educational tools for Mon speakers or learners.
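For use cases like those listed above, a minimal text-generation sketch might look like the following. It assumes the model is published on the Hugging Face Hub under the identifier `janakhpon/mon-lm-qwen2.5-1.5b` and loads with the standard `transformers` auto classes; neither is confirmed here. The helper names `prepare_prompt` and `generate` are hypothetical.

```python
import unicodedata

# Assumption: this identifier is a Hugging Face Hub repository id.
MODEL_ID = "janakhpon/mon-lm-qwen2.5-1.5b"


def prepare_prompt(text: str) -> str:
    """Trim whitespace and NFC-normalize, matching the training-time normalization."""
    return unicodedata.normalize("NFC", text.strip())


def generate(prompt: str, max_new_tokens: int = 64) -> str:
    """Generate a continuation for `prompt` (downloads the model on first call)."""
    # Imported lazily so prepare_prompt works without transformers installed.
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
    model = AutoModelForCausalLM.from_pretrained(MODEL_ID)

    inputs = tokenizer(prepare_prompt(prompt), return_tensors="pt")
    output_ids = model.generate(**inputs, max_new_tokens=max_new_tokens)
    return tokenizer.decode(output_ids[0], skip_special_tokens=True)
```

Calling `generate("...")` with a Mon prompt returns the prompt followed by the model's continuation. Since the model is described as continually pre-trained, not instruction-tuned, plain text completion is likely a better fit than chat-style prompting.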
Developed as part of the Mon Language AI initiative, this model is intended to advance Mon language technology.