inspirebek/qwen3-4b-uzbek-v2
inspirebek/qwen3-4b-uzbek-v2 is a 4-billion-parameter Qwen3-based language model fine-tuned by inspirebek and optimized for the Uzbek language. It has a 32,768-token context length and is built as an Uzbek-first chat assistant, showing significant improvements in Uzbek language understanding and generation over its base model. The model targets applications requiring strong performance in Uzbek while retaining English capabilities.
Overview
inspirebek/qwen3-4b-uzbek-v2 is a 4 billion parameter Qwen3-based language model, specifically fine-tuned for the Uzbek language. This model addresses the challenge of adapting English-dominant base models to new languages by expanding the LoRA configuration to include embed_tokens and lm_head, which are crucial for re-mapping the vocabulary. This approach significantly improved performance on Uzbek benchmarks, with MMLU-uz jumping to 40.50% from a near-random baseline.
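An adapter configuration of the kind described above might look roughly like the following. This is an illustrative sketch using the `peft` library; only `r=64`, `alpha=128`, `use_rslora=True`, and the inclusion of `embed_tokens` and `lm_head` come from the model card, while the remaining target module names are typical for Qwen-style architectures and are assumptions.

```python
# Hypothetical LoRA setup that also adapts the embedding and output
# layers for vocabulary re-mapping. Hyperparameters beyond
# r=64 / alpha=128 / use_rslora=True are illustrative, not the
# author's exact configuration.
from peft import LoraConfig

lora_config = LoraConfig(
    r=64,
    lora_alpha=128,
    use_rslora=True,  # rank-stabilized LoRA scaling
    target_modules=[
        "q_proj", "k_proj", "v_proj", "o_proj",  # attention (assumed)
        "gate_proj", "up_proj", "down_proj",      # MLP (assumed)
        "embed_tokens", "lm_head",                # vocabulary re-mapping
    ],
    task_type="CAUSAL_LM",
)
```

Adapting `embed_tokens` and `lm_head` is what lets the model re-map its vocabulary toward Uzbek rather than only steering the attention and MLP layers.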
Key Capabilities
- Uzbek Language Proficiency: Achieves 40.50% on MMLU-uz and 33.42% on UzLib, demonstrating strong understanding and generation in Uzbek.
- Dual-Stage Fine-tuning: Utilizes a two-stage LoRA fine-tuning process, including continued pretraining on native Uzbek text and supervised fine-tuning on chat-formatted Uzbek instructions.
- Efficient Training: Employs `unsloth` and `peft` with specific LoRA configurations (`r=64`, `alpha=128`, `use_rslora=True`) and a dual learning rate strategy to optimize training within compute constraints.
- Robustness: Features a `TrainerCallback` for pushing checkpoints to Hugging Face, enabling seamless resumption of training after compute timeouts.
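A checkpoint-pushing callback of the kind listed above could be sketched as follows. This is a minimal illustration assuming the Hugging Face `transformers` Trainer API; the class name and repo id are placeholders, not the author's actual implementation.

```python
from transformers import TrainerCallback


class PushCheckpointCallback(TrainerCallback):
    """Push each saved checkpoint to the Hub so training can resume
    after a compute timeout. The repo id is a placeholder."""

    def __init__(self, repo_id="your-username/your-checkpoint-repo"):
        self.repo_id = repo_id

    def on_save(self, args, state, control, **kwargs):
        # The Trainer passes the model among the event kwargs.
        model = kwargs.get("model")
        if model is not None:
            model.push_to_hub(
                self.repo_id,
                commit_message=f"checkpoint at step {state.global_step}",
            )
        return control
```

Passing an instance via `Trainer(..., callbacks=[PushCheckpointCallback("user/repo")])` would then upload every saved checkpoint, so an interrupted run can be resumed from the Hub copy.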
Intended Use Cases
- Uzbek-first Chat Assistants: Designed primarily for conversational AI applications in Uzbek.
- Multilingual Applications: Capable in English as well, making it suitable for scenarios requiring both Uzbek and English language support.
- Research and Development: Serves as a research artifact for exploring language model adaptation to low-resource languages. Users should be aware of the CC-BY-NC-4.0 license on some training data, which restricts commercial use of derivative models unless those subsets are excluded.
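The dual learning rate strategy mentioned under Key Capabilities can be sketched with optimizer parameter groups: a lower rate for the large embedding/output matrices and a higher rate for the lightweight adapters. The rates and the toy module below are illustrative assumptions; the model card does not state the author's actual values.

```python
import torch
import torch.nn as nn

# Toy stand-in for a model with embedding/head weights and adapter weights.
model = nn.ModuleDict({
    "embed_tokens": nn.Embedding(100, 16),
    "lora_adapter": nn.Linear(16, 16),
})

# Illustrative dual learning rates: conservative for embeddings,
# more aggressive for the small LoRA adapter matrices.
optimizer = torch.optim.AdamW([
    {"params": model["embed_tokens"].parameters(), "lr": 1e-5},
    {"params": model["lora_adapter"].parameters(), "lr": 2e-4},
])
```

Splitting the parameters this way lets the re-mapped vocabulary layers move slowly (avoiding catastrophic drift) while the adapters learn the new language quickly.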