junaid008/qehwa-pashto-llm

Hugging Face
TEXT GENERATIONConcurrency Cost:1Model Size:7.6BQuant:FP8Ctx Length:32kPublished:Mar 13, 2026License:apache-2.0Architecture:Transformer0.0K Open Weights Warm

Qehwa is the first Pashto large language model, developed by Junaid Aslam, built on Qwen2.5-7B. It is specifically instruction-tuned for the Peshawari dialect of Pakistani Pashto, undergoing continued pre-training on 3.4 million Pashto documents and supervised fine-tuning on 126,519 instruction-response pairs. This model excels at generating natural Pashto conversation, creative writing, and translating English and Urdu instructions into Pashto, making it ideal for Pashto-specific NLP tasks.

Loading preview...

Qehwa: The First Pashto LLM

Qehwa, developed by Junaid Aslam, is the first dedicated Pakistani Pashto large language model, specifically targeting the Peshawari/KPK dialect. Built on Qwen2.5-7B, it underwent a two-stage training process:

  • Continued Pre-Training (CPT): On 3.4 million clean Pakistani Pashto documents.
  • Supervised Fine-Tuning (SFT): On 126,519 high-quality Peshawari Pashto instruction-response pairs.

Key Capabilities

  • Generates responses in pure Peshawari Pashto.
  • Responds to English and Urdu instructions in Pashto.
  • Facilitates natural Pashto conversation and creative writing.
  • Provides information on Islamic topics, KPK history, culture, geography, and Pashtunwali traditions.
  • Offers Pashto grammar correction and English to Pashto translation.

Evaluation & Performance

Qehwa was evaluated on a custom benchmark of 150 tests across 15 categories, achieving an overall average accuracy of 85.3%. It demonstrated strong performance in:

  • English → Pashto translation: 90%
  • Urdu → Pashto translation: 84%
  • Health & Daily Life in Pashto: 90%

This model is a significant resource for the 60+ million Pashto speakers, particularly those using the Peshawari dialect, and is released under the Apache 2.0 License.