Khurram123/Shaheen-Gemma4-Urdu

VISIONConcurrency Cost:1Model Size:5.1BQuant:BF16Ctx Length:32kTool Calling:SupportedPublished:Apr 3, 2026Architecture:Transformer0.0K Cold

Shaheen-Gemma4-Urdu is a 5.1 billion parameter Urdu language model developed by Khurram Pervez (Khurram123), fine-tuned on 51,686 high-quality Urdu instruction samples. Based on the Gemma 4 (2B) architecture, it provides deep linguistic understanding, formal vocabulary, and cultural nuance in Urdu. This model excels at handling complex Urdu grammar and literature, making it suitable for applications requiring high-fidelity Urdu text generation and comprehension.

Loading preview...

Overview

Shaheen-Gemma4-Urdu is a 5.1 billion parameter Urdu language model developed by Khurram Pervez (Khurram123). It is specifically fine-tuned on 51,686 high-quality Urdu instruction samples to achieve deep linguistic understanding, formal vocabulary, and cultural nuance in the Urdu language. The model is built upon the state-of-the-art Gemma 4 (2B) architecture and is available in both 16-bit Safetensors and Quantized GGUF formats.

Key Capabilities

  • Exceptional Urdu Fluency: Tuned to handle complex Urdu grammar and formal literature with high precision.
  • Efficient Performance: Delivers approximately 94 tokens per second on an NVIDIA RTX 4060 Ti, offering fast inference.
  • Dual Format Availability: Provided as model.safetensors for transformers and Shaheen-Gemma4-Urdu-Q4_K_M.gguf for llama.cpp integration.
  • Strong Generalization: Achieved a final loss of 1.118 after approximately 2 hours of training (1 full epoch) on the large-traversaal/urdu-instruct dataset.

Good For

  • Applications requiring high-fidelity Urdu text generation.
  • Tasks involving complex Urdu grammar and formal literary understanding.
  • Developers needing an efficient Urdu-specific LLM for deployment on-device or GPU-accelerated inference.