uzlm/alloma-1B-Instruct: Uzbek-Optimized Llama Model
The alloma-1B-Instruct is a 1 billion parameter model from the Llama series, developed by Examy.me and Teamwork.uz, with a primary focus on enhancing performance for the Uzbek language. It was continually pretrained with a 2048-token context length on a 2.4 billion token dataset (75% English, 25% Uzbek) and subsequently fine-tuned using Supervised Fine-Tuning (SFT).
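The stated 75/25 English-Uzbek mix can be illustrated with a toy sampler. The sampling mechanics below are an assumption for illustration only; the card does not describe the authors' actual data pipeline:

```python
import random

# Pretraining mix as stated in the card; the sampler itself is hypothetical.
MIX = {"english": 0.75, "uzbek": 0.25}

def next_corpus(rng):
    """Pick which corpus the next pretraining document is drawn from."""
    return rng.choices(list(MIX), weights=list(MIX.values()), k=1)[0]

rng = random.Random(0)
draws = [next_corpus(rng) for _ in range(10_000)]
english_share = draws.count("english") / len(draws)  # close to 0.75
```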
Key Differentiators & Capabilities
- Efficient Uzbek Tokenization: Features a customized tokenizer that averages 1.7 tokens per Uzbek word, significantly more efficient than the ~3.5 tokens per word in original Llama models. This results in approximately 2x faster inference and a longer effective context length for Uzbek text.
- Resource-Friendly Deployment: Designed to run efficiently, requiring only 2 GB of VRAM with quantization, making it suitable for deployment on small GPUs, edge devices, and mobile platforms.
- Methodology: Achieved Uzbek optimization by adapting the meta-llama/Llama-3.2 base model through an in-place vocabulary replacement strategy. Less relevant non-ASCII tokens were replaced with custom Uzbek tokens, preserving the original architecture and total vocabulary size. New Uzbek token embeddings were initialized via subtoken averaging.
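The efficiency claims above can be sanity-checked with simple arithmetic. The input figures (1.7 vs. ~3.5 tokens per word, 2048-token context, 1 billion parameters) come from this card; activation and KV-cache overhead is ignored, so the memory numbers cover weights only:

```python
# Back-of-the-envelope check of the tokenizer and memory claims above.
ALLOMA_TOKENS_PER_WORD = 1.7
LLAMA_TOKENS_PER_WORD = 3.5   # approximate figure for the original tokenizer
CONTEXT_TOKENS = 2048
PARAMS = 1_000_000_000

# Fewer tokens per word means proportionally less compute per Uzbek word.
speedup = LLAMA_TOKENS_PER_WORD / ALLOMA_TOKENS_PER_WORD      # ~2.06x

# More Uzbek words fit into the same 2048-token window.
words_alloma = CONTEXT_TOKENS / ALLOMA_TOKENS_PER_WORD        # ~1204 words
words_llama = CONTEXT_TOKENS / LLAMA_TOKENS_PER_WORD          # ~585 words

def weight_memory_gb(bytes_per_param, params=PARAMS):
    """Memory for the model weights alone at a given precision."""
    return params * bytes_per_param / 1e9

fp16_gb = weight_memory_gb(2.0)   # 2.0 GB in half precision
int8_gb = weight_memory_gb(1.0)   # 1.0 GB with 8-bit quantization
int4_gb = weight_memory_gb(0.5)   # 0.5 GB with 4-bit quantization
```

Quantized weights alone fit well under the card's 2 GB figure, which leaves headroom for activations, the KV cache, and runtime overhead.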
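The subtoken-averaging initialization described above can be sketched in a few lines. The embedding table size, hidden dimension, and example subtoken ids below are illustrative stand-ins, not the model's real values:

```python
import numpy as np

# Toy stand-in for the original Llama embedding table.
rng = np.random.default_rng(0)
vocab_size, hidden_dim = 100, 8
old_embeddings = rng.normal(size=(vocab_size, hidden_dim))

def init_new_token(subtoken_ids, embeddings):
    """Initialize a replacement Uzbek token as the mean of the embeddings
    of the subtokens the original tokenizer split the word into."""
    return embeddings[subtoken_ids].mean(axis=0)

# e.g. an Uzbek word the original tokenizer split into three subtokens
new_vec = init_new_token([5, 17, 42], old_embeddings)
```

Because each new vector starts near the average of its old subtokens rather than at random, continual pretraining can adapt it without destabilizing the rest of the unchanged-size vocabulary.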
Performance Highlights
Benchmarks show that alloma-1B-Instruct consistently outperforms its base Llama-3.2 1B Instruct counterpart on Uzbek-English and English-Uzbek translation (BLEU and COMET scores) and on Uzbek sentiment analysis. Alongside this strong Uzbek-specific performance, there is a slight decline on MMLU (English) and Uzbek News Classification, attributed to catastrophic forgetting of the base model's original English instruction-following ability.
Ideal Use Cases
- Applications requiring efficient processing and generation of Uzbek text.
- Translation services between Uzbek and English.
- Sentiment analysis for Uzbek content.
- Deployment on devices with limited computational resources.