uzlm/alloma-3B-Instruct

Text generation · Model size: 3.2B · Quantization: BF16 · Context length: 32k · Published: Sep 3, 2025 · License: llama3.2 · Architecture: Transformer

uzlm/alloma-3B-Instruct is a 3.2-billion-parameter instruction-tuned causal language model developed by Examy.me and Teamwork.uz, based on the Llama-3.2 architecture. A customized tokenizer significantly improves efficiency for Uzbek text, enabling roughly 2x faster inference and a longer effective context. The model is optimized for Uzbek-language tasks, outperforms the base Llama models in translation and sentiment analysis, and can run efficiently on devices with limited VRAM.


uzlm/alloma-3B-Instruct: Uzbek-Optimized Llama-3.2 Model

alloma-3B-Instruct is a 3.2-billion-parameter instruction-tuned model, part of a series developed by Examy.me and Teamwork.uz that builds on the Llama-3.2 base architecture. Its core innovation is a customized tokenizer designed specifically for the Uzbek language. This tokenizer processes Uzbek text at approximately 1.7 tokens per word, a significant improvement over the original Llama models' ~3.5 tokens per word, effectively doubling inference speed and extending the practical context length for Uzbek content.
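The tokenizer gain can be illustrated with back-of-the-envelope arithmetic using the figures above (the tokens-per-word rates and 32k context length are the quoted values; the helper function is purely illustrative):

```python
# Rough effective-context comparison from the quoted tokenizer statistics.
# Rates: ~1.7 tokens/word (alloma custom tokenizer) vs ~3.5 (base Llama-3.2).
CTX_TOKENS = 32_000  # advertised context length

ALLOMA_TPW = 1.7
LLAMA_TPW = 3.5

def effective_words(ctx_tokens: int, tokens_per_word: float) -> int:
    """Approximate number of Uzbek words that fit in the context window."""
    return int(ctx_tokens / tokens_per_word)

alloma_words = effective_words(CTX_TOKENS, ALLOMA_TPW)  # ~18,800 words
llama_words = effective_words(CTX_TOKENS, LLAMA_TPW)    # ~9,100 words

print(f"alloma: ~{alloma_words} words, base Llama: ~{llama_words} words")
print(f"token reduction: ~{LLAMA_TPW / ALLOMA_TPW:.1f}x fewer tokens per word")
```

The ~2x reduction in tokens per word is what drives both the "2x faster inference" claim and the longer effective context for Uzbek text.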

Key Capabilities & Features

  • Uzbek Language Optimization: Achieves superior performance in Uzbek-English and English-Uzbek translation (BLEU and COMET scores) and Uzbek sentiment analysis compared to its base Llama counterparts.
  • Efficient Tokenization: Utilizes an in-place vocabulary adaptation strategy, replacing less relevant non-ASCII tokens in the original Llama vocabulary with custom Uzbek tokens, without altering the model's architecture.
  • Resource-Efficient: Can be run on devices with as little as 2 GB of VRAM (with quantization), making it suitable for small GPUs, edge devices, and mobile applications.
  • Continual Pretraining: The model underwent continual pretraining on a bilingual dataset (75% English, 25% Uzbek) with a context length of 2048 tokens, followed by supervised fine-tuning (SFT).
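The 2 GB VRAM figure above can be sanity-checked with a weights-only memory estimate (activation and KV-cache overhead excluded; the per-parameter byte counts are standard precision sizes, not values from the model card):

```python
# Weights-only memory estimate for a 3.2B-parameter model at several precisions.
PARAMS = 3.2e9

BYTES_PER_PARAM = {
    "bf16": 2.0,  # 16-bit brain float, the published precision
    "int8": 1.0,  # 8-bit quantization
    "int4": 0.5,  # 4-bit quantization
}

def weights_gb(params: float, bytes_per_param: float) -> float:
    """Gigabytes needed just to hold the weights (1 GB = 1e9 bytes)."""
    return params * bytes_per_param / 1e9

for precision, bpp in BYTES_PER_PARAM.items():
    print(f"{precision}: ~{weights_gb(PARAMS, bpp):.1f} GB")
```

At 4-bit precision the weights alone come to roughly 1.6 GB, which is consistent with running the model on devices with about 2 GB of VRAM.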

Performance Highlights

Benchmarks demonstrate alloma-3B-Instruct's strong performance in Uzbek-specific tasks:

  • BLEU Uz→En: 25.19 (vs. Llama-3.2 3B Instruct's 11.91)
  • BLEU En→Uz: 14.66 (vs. Llama-3.2 3B Instruct's 2.54)
  • COMET Uz→En: 85.08 (vs. Llama-3.2 3B Instruct's 71.96)
  • Uzbek Sentiment Analysis: 81.64 (vs. Llama-3.2 3B Instruct's 56.01)

While excelling at Uzbek tasks, the model shows a slight decline in English MMLU and Uzbek news classification relative to the base Llama model, a regression attributed to catastrophic forgetting of the original English instruction-following ability during Uzbek-focused training.

Use Cases

This model is ideal for applications requiring robust and efficient processing of the Uzbek language, including:

  • Machine translation between Uzbek and English.
  • Sentiment analysis of Uzbek text.
  • Language generation and understanding in Uzbek contexts.
  • Deployment on resource-constrained environments where Uzbek language support is critical.
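For the translation use case, a prompt can be built in the Llama-3-style chat format that Llama-3.2 derivatives typically inherit. This is a sketch under that assumption; in practice, prefer the tokenizer's own `apply_chat_template()` so the model's actual template is applied:

```python
# Sketch of a translation prompt in the Llama-3-style chat format commonly
# used by Llama-3.2 derivatives. Verify against the model's own chat template
# (tokenizer.apply_chat_template) before relying on this exact layout.

def build_prompt(system: str, user: str) -> str:
    """Assemble a single-turn chat prompt with Llama-3-style special tokens."""
    return (
        "<|begin_of_text|>"
        "<|start_header_id|>system<|end_header_id|>\n\n"
        f"{system}<|eot_id|>"
        "<|start_header_id|>user<|end_header_id|>\n\n"
        f"{user}<|eot_id|>"
        "<|start_header_id|>assistant<|end_header_id|>\n\n"
    )

prompt = build_prompt(
    "You are a helpful Uzbek-English translator.",
    "Translate to English: Salom, dunyo!",  # Uzbek for "Hello, world!"
)
print(prompt)
```

The prompt string would then be tokenized and passed to the model for generation, e.g. via `transformers`' `generate()`.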