McGill-NLP/AfriqueQwen-14B
Text Generation · Concurrency Cost: 1 · Model Size: 14B · Quantization: FP8 · Context Length: 32k · Published: Jan 7, 2026 · License: CC-BY-4.0 · Architecture: Transformer · Open Weights

McGill-NLP/AfriqueQwen-14B is a 14-billion-parameter language model from the AfriqueLLM suite, adapted from Qwen3-14B-Base through continued pre-training on 27.5 billion tokens. It is optimized for 20 African languages while maintaining strong performance in high-resource languages, and supports a 32,768-token context length. Its primary strength is significantly improved performance on African-language benchmarks relative to its base model and comparable alternatives.

AfriqueQwen-14B: African Language Optimized LLM

AfriqueQwen-14B is the flagship model of the AfriqueLLM suite, developed by McGill-NLP. This 14-billion-parameter model is built on Qwen3-14B-Base and has undergone extensive continued pre-training (CPT) on 27.5 billion tokens of multilingual data, with a strong focus on African languages.
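
Because the model inherits the standard Qwen3 architecture, it should load with the usual Hugging Face `transformers` API. The following is a minimal sketch, assuming a recent `transformers` release with Qwen3 support; the dtype and sampling settings are illustrative choices, not recommendations from the model card.

```python
# Minimal sketch: loading AfriqueQwen-14B with Hugging Face transformers.
# Assumptions: a recent transformers release with Qwen3 support, plus
# accelerate installed for device_map="auto". Sampling parameters below
# are illustrative, not tuned recommendations.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "McGill-NLP/AfriqueQwen-14B"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,  # assumption: bf16 suits the published weights
    device_map="auto",
)

# This is a base (non-instruct) model, so prompt it as plain text completion.
prompt = "Habari za leo ni"  # Swahili completion prompt
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)

outputs = model.generate(
    **inputs,
    max_new_tokens=64,
    do_sample=True,
    temperature=0.7,
    top_p=0.9,
)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```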

Key Capabilities

  • Multilingual Adaptation: Specifically adapted for 20 African languages (e.g., Swahili, Hausa, Yoruba, Amharic) while preserving strong performance in high-resource languages like English, French, Portuguese, and Arabic.
  • Robust Base Model: Built on Qwen3-14B-Base, which demonstrated superior performance preservation during CPT and strong results on long-context tasks such as document-level translation.
  • Extended Context Length: Supports a native context length of 32,768 tokens, enabling processing of longer texts such as full documents (see the sketch after this list).
  • Comprehensive Training Data: Trained on a diverse corpus comprising 22.8B tokens of African monolingual data, 1B tokens of code (CornStack-Python), 1B tokens of mathematics (FineMath-4+), and 324M tokens of GPT-4.1-translated synthetic data.
  • Benchmark Performance: Achieves significant improvements on African-language benchmarks, reaching a 60.0% overall score, a +23.9% gain over its base model, and outperforming the other AfriqueLLM variants and base models such as Llama3.1-8B on benchmarks including AfriMGSM, AfriMMLU, and Belebele.
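
As a concrete illustration of the long-context point above, the sketch below prepares a document-level translation prompt that fits within the 32,768-token window rather than translating sentence by sentence. The prompt template, the reserved output budget, and the placeholder document are assumptions for illustration; the model card does not prescribe an official prompting scheme.

```python
# Illustrative sketch: building a single document-level translation prompt
# that fits AfriqueQwen-14B's 32,768-token context window. The template and
# margins here are hypothetical choices, not from the model card.
from transformers import AutoTokenizer

model_id = "McGill-NLP/AfriqueQwen-14B"
tokenizer = AutoTokenizer.from_pretrained(model_id)

MAX_CTX = 32768          # native context length from the model card
RESERVED_OUTPUT = 4096   # assumption: tokens left free for the translation

def build_translation_prompt(document: str, src: str = "English", tgt: str = "Swahili") -> str:
    """Wrap an entire document in one translation prompt and verify it
    leaves room for generation within the context window."""
    prompt = f"{src} document:\n{document}\n\n{tgt} translation:\n"
    n_tokens = len(tokenizer(prompt)["input_ids"])
    if n_tokens > MAX_CTX - RESERVED_OUTPUT:
        raise ValueError(
            f"Prompt is {n_tokens} tokens; split the document before translating."
        )
    return prompt

document = "The ministry announced new agricultural subsidies..."  # placeholder text
prompt = build_translation_prompt(document)
```

Translating the whole document in one pass lets the model resolve pronouns, terminology, and discourse structure across paragraphs, which is where long-context models tend to outperform sentence-level pipelines.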

Good For

  • Applications requiring strong language understanding and generation in African languages.
  • Tasks benefiting from long-context processing in multilingual environments.
  • Developers seeking a robust base model with enhanced performance for low-resource languages without significant degradation in high-resource language capabilities.