behbudiy/Mistral-7B-Instruct-Uz
Text Generation · Concurrency Cost: 1 · Model Size: 7B · Quant: FP8 · Ctx Length: 4k · Published: Aug 30, 2024 · License: apache-2.0 · Architecture: Transformer

behbudiy/Mistral-7B-Instruct-Uz is a 7-billion-parameter instruction-tuned language model, continually pre-trained and fine-tuned by Eldor Fozilov, Azimjon Urinov, and Khurshid Juraev. It is optimized for natural language processing tasks in Uzbek, including machine translation, summarization, and dialogue systems, and significantly improves performance on Uzbek language benchmarks while maintaining English language understanding.


behbudiy/Mistral-7B-Instruct-Uz: Uzbek-Optimized Mistral 7B

This model is a 7 billion parameter instruction-tuned variant of Mistral 7B, developed by Eldor Fozilov, Azimjon Urinov, and Khurshid Juraev. It has undergone continual pre-training and instruction-tuning using a blend of public and synthetic Uzbek and English data. The primary goal is to enhance its capabilities for Uzbek language tasks while preserving its original English knowledge.

Key Capabilities & Differentiators

  • Superior Uzbek Language Performance: Benchmarks show significant improvements over base models in Uzbek-English and English-Uzbek machine translation (BLEU and COMET scores), Uzbek sentiment analysis, and Uzbek news classification.
  • Multilingual Proficiency: While optimized for Uzbek, the model demonstrates robust performance on English tasks, with MMLU scores showing no significant decline compared to its base counterpart.
  • Instruction Following: Designed to support various NLP tasks in Uzbek, such as machine translation, summarization, and dialogue systems.
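Since the model is an instruction-tuned Mistral variant, it is typically prompted with Mistral's `[INST] … [/INST]` chat format. A minimal inference sketch, assuming the weights are hosted on the Hugging Face Hub under the repository id shown on this card and that the standard `transformers` API applies (verify both before relying on this):

```python
# Minimal inference sketch for behbudiy/Mistral-7B-Instruct-Uz.
# Assumes the Hugging Face `transformers` library and enough GPU/CPU
# memory for a 7B model; the prompt shape follows Mistral's common
# instruction convention ([INST] ... [/INST]).

def format_mistral_prompt(user_message: str) -> str:
    """Wrap a single-turn user message in Mistral's instruction format.

    The tokenizer adds the BOS token itself, so it is not included here.
    """
    return f"[INST] {user_message} [/INST]"


def main() -> None:
    # Heavy imports and the model download are kept out of module scope
    # so the helper above can be reused without loading the weights.
    from transformers import AutoModelForCausalLM, AutoTokenizer

    model_id = "behbudiy/Mistral-7B-Instruct-Uz"
    tokenizer = AutoTokenizer.from_pretrained(model_id)
    model = AutoModelForCausalLM.from_pretrained(model_id, device_map="auto")

    # "Give brief information about Uzbekistan." (Uzbek)
    prompt = format_mistral_prompt("O'zbekiston haqida qisqacha ma'lumot bering.")
    inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
    output_ids = model.generate(**inputs, max_new_tokens=256, do_sample=False)
    print(tokenizer.decode(output_ids[0], skip_special_tokens=True))


if __name__ == "__main__":
    main()
```

For multi-turn use, `tokenizer.apply_chat_template` with a list of role/content messages is the more robust option, since it applies whatever template ships with the tokenizer.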

Performance Highlights

Comparative evaluations against Llama-3.1 8B Instruct and Mistral 7B Instruct show that Mistral 7B Instruct Uz consistently outperforms its base version on key Uzbek language metrics. For instance, it achieves 29.39 BLEU for Uzbek-to-English and 16.77 BLEU for English-to-Uzbek translation, far above the base Mistral 7B Instruct's 7.47 and 0.67 respectively. Its English MMLU score of 55.72 indicates that general language understanding is preserved.

Use Cases

This model is ideal for developers and researchers focusing on applications requiring high-quality Uzbek language processing, including:

  • Machine translation between Uzbek and English.
  • Uzbek text summarization and dialogue systems.
  • Sentiment analysis and classification of Uzbek content.
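Each of these use cases reduces to crafting a clear instruction in the `[INST] … [/INST]` format. A small sketch of task-specific prompt builders; the instruction wordings below are illustrative examples, not official templates from the model's authors:

```python
# Illustrative prompt builders for the use cases listed above.
# The Uzbek/English instruction wordings are assumptions, not part
# of the model card; adjust them to your task.

def translation_prompt(text: str, src: str = "Uzbek", tgt: str = "English") -> str:
    """Ask the model to translate `text` between Uzbek and English."""
    return f"[INST] Translate the following {src} text to {tgt}:\n{text} [/INST]"


def summarization_prompt(text: str) -> str:
    """Ask the model for a short Uzbek summary of `text`.

    The instruction means "Summarize the following text briefly."
    """
    return f"[INST] Quyidagi matnni qisqacha umumlashtiring:\n{text} [/INST]"


def sentiment_prompt(text: str) -> str:
    """Ask the model to label an Uzbek text as positive/negative/neutral.

    'ijobiy' = positive, 'salbiy' = negative, 'neytral' = neutral.
    """
    return (
        "[INST] Quyidagi matnning his-tuyg'usini 'ijobiy', 'salbiy' yoki "
        f"'neytral' deb belgilang:\n{text} [/INST]"
    )
```

The resulting strings can be tokenized and passed to `model.generate` as in any standard `transformers` causal-LM workflow.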