McGill-NLP/AfriqueQwen-14B is a 14-billion-parameter language model from the AfriqueLLM suite, adapted from Qwen3-14B-Base. It was continually pre-trained on 27.5 billion tokens with a focus on 20 African languages while maintaining strong performance in high-resource languages. The model excels in multilingual contexts, particularly for tasks involving African languages, and supports a 32,768-token context length. Its primary strength is its significantly improved performance on African language benchmarks relative to its base model and other alternatives.
AfriqueQwen-14B: African Language Optimized LLM
AfriqueQwen-14B is the flagship model of the AfriqueLLM suite, developed by McGill-NLP. This 14-billion-parameter model is built on the Qwen3-14B-Base architecture and has undergone extensive continued pre-training (CPT) on 27.5 billion tokens of multilingual data, with a strong focus on African languages.
Key Capabilities
- Multilingual Adaptation: Specifically adapted for 20 African languages (e.g., Swahili, Hausa, Yoruba, Amharic) while preserving strong performance in high-resource languages like English, French, Portuguese, and Arabic.
- Robust Base Model: Built on Qwen3-14B-Base, which demonstrated superior performance preservation during CPT and strong results on long-context tasks such as document-level translation.
- Extended Context Length: Supports a native context length of 32,768 tokens, enabling processing of longer texts.
- Comprehensive Training Data: Trained on a diverse corpus including 22.8B tokens of African monolingual data, 1B tokens of code (CornStack-Python), 1B tokens of mathematics (FineMath-4+), and 324M tokens of GPT-4.1 translated synthetic data.
- Benchmark Performance: Achieves significant improvements on African language benchmarks, with a +23.9-point overall gain over its base model (reaching 60.0%), outperforming other AfriqueLLM variants and base models such as Llama3.1-8B on AfriMGSM, AfriMMLU, and Belebele.
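For documents that exceed the 32,768-token window, inputs need to be chunked before inference. The helper below is a minimal sketch of overlapping-window chunking; the `overlap` and `reserve` values are illustrative assumptions, not recommendations from the model authors, and `tokens` would come from the model's tokenizer in practice.

```python
CONTEXT_LEN = 32768  # native context window of AfriqueQwen-14B


def chunk_tokens(tokens, window=CONTEXT_LEN, overlap=256, reserve=512):
    """Split a token list into windows that fit the context.

    `reserve` tokens are held back for the model's generated output,
    and consecutive chunks share `overlap` tokens so text near a chunk
    boundary is never seen without any preceding context.
    """
    step = window - reserve - overlap
    if step <= 0:
        raise ValueError("window too small for overlap + reserve")
    chunks = []
    start = 0
    while start < len(tokens):
        chunks.append(tokens[start:start + window - reserve])
        start += step
    return chunks
```

Each chunk can then be passed to the model independently (e.g., for document-level translation), with the overlap giving continuity between chunk outputs.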
Good For
- Applications requiring strong language understanding and generation in African languages.
- Tasks benefiting from long-context processing in multilingual environments.
- Developers seeking a robust base model with enhanced performance for low-resource languages without significant degradation in high-resource language capabilities.
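As a sketch of how the model might be used, the snippet below loads it with the Hugging Face `transformers` library and runs completion-style generation. Only the model ID comes from this card; the prompt template, dtype, and device settings are illustrative assumptions (this is a base model, so plain completion prompts are assumed rather than a chat format).

```python
MODEL_ID = "McGill-NLP/AfriqueQwen-14B"


def build_prompt(instruction, text):
    # Illustrative completion-style template for a base (non-chat)
    # model; not an official prompt format.
    return f"{instruction}\n\n{text}\n"


def generate(prompt, max_new_tokens=128):
    # Imported lazily so build_prompt stays usable without the heavy
    # transformers/torch dependencies installed.
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tok = AutoTokenizer.from_pretrained(MODEL_ID)
    model = AutoModelForCausalLM.from_pretrained(
        MODEL_ID, torch_dtype="auto", device_map="auto"
    )
    inputs = tok(prompt, return_tensors="pt").to(model.device)
    out = model.generate(**inputs, max_new_tokens=max_new_tokens)
    # Decode only the newly generated tokens, not the prompt.
    return tok.decode(
        out[0][inputs["input_ids"].shape[-1]:], skip_special_tokens=True
    )
```

For example, `generate(build_prompt("Tafsiri kwa Kiingereza:", "Habari za asubuhi."))` would ask for a Swahili-to-English translation; note the full model requires roughly 28 GB of memory in bfloat16.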