McGill-NLP/AfriqueGemma-4B

Vision · Model Size: 4.3B · Quant: BF16 · Ctx Length: 32k · Concurrency Cost: 1 · Published: Jan 6, 2026 · License: cc-by-4.0 · Architecture: Transformer · Open Weights

McGill-NLP/AfriqueGemma-4B is a 4 billion parameter causal language model, part of the AfriqueLLM suite, developed by McGill-NLP. It is based on Google's Gemma 3 4B PT and has been continuously pre-trained on approximately 26 billion tokens, specifically adapted for 20 African languages while maintaining performance in high-resource languages. With an 8,192 token context length, this model is optimized for multilingual applications, particularly excelling in African language understanding and generation.


Model Overview

McGill-NLP/AfriqueGemma-4B is a 4 billion parameter causal language model from the AfriqueLLM suite, developed by McGill-NLP. It is built upon the google/gemma-3-4b-pt base model and has undergone continued pre-training (CPT) on approximately 26 billion tokens of carefully curated multilingual data. This adaptation significantly enhances its performance across 20 African languages while preserving strong capabilities in high-resource languages like English, French, Portuguese, and Arabic.
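A minimal usage sketch with Hugging Face Transformers follows. It assumes the `transformers` and `torch` packages are installed; the prompt and generation settings are illustrative, not recommendations from this card.

```python
# Minimal sketch of loading McGill-NLP/AfriqueGemma-4B with Hugging Face
# Transformers. The generation parameters below are illustrative defaults.
from transformers import AutoModelForCausalLM, AutoTokenizer

MODEL_ID = "McGill-NLP/AfriqueGemma-4B"

def generate(prompt: str, max_new_tokens: int = 64) -> str:
    """Load the model lazily and return a greedy completion of `prompt`."""
    tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
    model = AutoModelForCausalLM.from_pretrained(MODEL_ID, torch_dtype="bfloat16")
    inputs = tokenizer(prompt, return_tensors="pt")
    outputs = model.generate(**inputs, max_new_tokens=max_new_tokens)
    return tokenizer.decode(outputs[0], skip_special_tokens=True)

if __name__ == "__main__":
    # Swahili prompt: "The capital of Kenya is"
    print(generate("Mji mkuu wa Kenya ni"))
```

Since the weights are BF16 (per the card's metadata), loading in `bfloat16` avoids an upcast to FP32 and roughly halves memory use.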

Key Capabilities

  • Multilingual Proficiency: Specifically adapted for 20 African languages, including Swahili, Hausa, Yoruba, Amharic, and Zulu, alongside major global languages.
  • Robust Base: Leverages the strong foundation of the Gemma 3 4B PT model.
  • Extended Context: Features an 8,192 token native context length, suitable for processing longer texts.
  • Specialized Training Data: Training corpus includes African monolingual data (22.8B tokens), code (1B tokens), mathematics (~1B tokens), and synthetic data, balanced using UniMax sampling.

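The UniMax balancing mentioned above (Chung et al., 2023) spreads a total token budget as uniformly as possible across languages while capping each language at a maximum number of epochs over its available data, so small corpora are not over-repeated. A rough sketch, with illustrative token counts and an assumed epoch cap (not the actual AfriqueLLM recipe):

```python
# Rough sketch of UniMax budget allocation (illustrative, not the exact
# AfriqueLLM recipe): split a total token budget uniformly across
# languages, but never take more than `max_epochs` passes over any one
# language's corpus. Languages are processed smallest-first so budget
# left over from capped languages is redistributed to larger ones.
def unimax_budgets(available_tokens, total_budget, max_epochs=4):
    remaining = total_budget
    budgets = {}
    # Smallest corpora first, so epoch caps bind before redistribution.
    langs = sorted(available_tokens, key=available_tokens.get)
    for i, lang in enumerate(langs):
        fair_share = remaining / (len(langs) - i)   # uniform split of what's left
        cap = max_epochs * available_tokens[lang]   # epoch cap for this language
        budgets[lang] = min(fair_share, cap)
        remaining -= budgets[lang]
    return budgets

# Hypothetical corpus sizes (tokens); the 0.2B-token language is capped
# at 4 epochs (0.8B) and its unused share flows to the larger corpora.
budgets = unimax_budgets(
    {"swh": 2.0e9, "hau": 1.5e9, "yor": 0.2e9}, total_budget=4.0e9
)
```

The effect is that high-resource languages absorb the budget that low-resource languages cannot use without excessive repetition, which is why a 22.8B-token African corpus can be mixed with code and math data without drowning out the smallest languages.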
Evaluation Highlights

AfriqueGemma-4B demonstrates notable improvements over its base model, Gemma3-4B, across multilingual benchmarks: a +7.6-point (18.8% relative) overall improvement on the AfriqueLLM evaluation suite, which includes AfriMGSM, AfriMMLU, AfriXNLI, and FLORES. This indicates enhanced understanding and generation capabilities in its target languages.

Good For

  • Applications requiring strong language understanding and generation in a wide array of African languages.
  • Developers building multilingual LLM solutions targeting African linguistic contexts.
  • Research and development in low-resource language NLP.