McGill-NLP/AfriqueGemma-4B

Tags: Vision, Open Weights, Cold
Concurrency Cost: 1
Model Size: 4.3B
Quant: BF16
Ctx Length: 32k
Published: Jan 6, 2026
License: cc-by-4.0
Architecture: Transformer

AfriqueGemma-4B is a 4-billion-parameter causal language model developed by McGill-NLP and adapted from Google's Gemma 3 4B PT. It targets improved performance across 20 African languages through continued pre-training on approximately 26 billion tokens of multilingual data. The model retains strong capabilities in high-resource languages while improving African-language understanding and generation, which makes it suitable for multilingual applications aimed at the African continent.


Model Overview

AfriqueGemma-4B, part of the AfriqueLLM suite by McGill-NLP, is a 4-billion-parameter causal language model based on google/gemma-3-4b-pt. It has been adapted to 20 African languages through continued pre-training on approximately 26 billion tokens of curated multilingual data, while retaining strong performance in high-resource languages such as English and French.
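
A minimal usage sketch, assuming the checkpoint loads through the standard Hugging Face transformers causal-LM classes (this page does not confirm that). Because the model derives from a pre-trained base checkpoint rather than an instruction-tuned one, the sketch uses a few-shot completion prompt instead of a chat template. The helper names build_few_shot_prompt and generate are hypothetical, and the English/Swahili pairs are illustrative only.

```python
MODEL_ID = "McGill-NLP/AfriqueGemma-4B"


def build_few_shot_prompt(examples, query, src="English", tgt="Swahili"):
    """Format (source, target) pairs plus a final query as a completion-style prompt."""
    blocks = [f"{src}: {s}\n{tgt}: {t}" for s, t in examples]
    # Leave the last target empty so the model completes it.
    blocks.append(f"{src}: {query}\n{tgt}:")
    return "\n\n".join(blocks)


def generate(prompt, max_new_tokens=32):
    """Load the model and complete the prompt.

    Assumption: torch, transformers, and accelerate (for device_map="auto")
    are installed and the checkpoint is compatible with AutoModelForCausalLM.
    """
    import torch
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
    model = AutoModelForCausalLM.from_pretrained(
        MODEL_ID, torch_dtype=torch.bfloat16, device_map="auto"
    )
    inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
    output = model.generate(**inputs, max_new_tokens=max_new_tokens, do_sample=False)
    # Decode only the newly generated tokens, not the prompt itself.
    new_tokens = output[0][inputs["input_ids"].shape[1]:]
    return tokenizer.decode(new_tokens, skip_special_tokens=True)


examples = [("Thank you.", "Asante."), ("Good morning.", "Habari ya asubuhi.")]
prompt = build_few_shot_prompt(examples, "How are you?")
print(prompt)
# completion = generate(prompt)  # downloads the weights; needs sufficient memory
```

The prompt builder is kept separate from model loading so the prompt format can be inspected and adjusted without downloading the weights.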

Key Capabilities

  • Multilingual Proficiency: Enhanced performance across 20 African languages (e.g., Swahili, Hausa, Yoruba, Amharic) due to targeted continued pre-training.
  • Base Model: Built on google/gemma-3-4b-pt, the pre-trained (not instruction-tuned) Gemma 3 4B checkpoint, and inherits its architecture.
  • Context Length: Native context length of 8,192 tokens (the listing metadata above gives a context length of 32k).
  • Specialized Training Data: The training corpus combines African monolingual data (FineWeb2, WURA, MADLAD-400), code (CornStack-Python), mathematics (FineMath-4+), and synthetic data translated with GPT-4.1.
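
To illustrate working within the 8,192-token native context listed above, here is a minimal sketch that splits a long sequence of token IDs into overlapping windows. The function name split_into_windows, the overlap of 256 tokens, and the plain integer list standing in for tokenizer output are all assumptions made for illustration; none of them come from the model's documentation.

```python
def split_into_windows(token_ids, max_len=8192, overlap=256):
    """Split token_ids into windows of at most max_len tokens.

    Consecutive windows share `overlap` tokens so that text near a
    boundary appears with some preceding context in the next window.
    """
    if not 0 <= overlap < max_len:
        raise ValueError("overlap must satisfy 0 <= overlap < max_len")
    step = max_len - overlap
    windows = []
    start = 0
    while True:
        windows.append(token_ids[start:start + max_len])
        # Stop once the current window reaches the end of the sequence.
        if start + max_len >= len(token_ids):
            break
        start += step
    return windows


# Stand-in for real tokenizer output: 20,000 dummy token IDs.
token_ids = list(range(20000))
windows = split_into_windows(token_ids)
print([len(w) for w in windows])  # [8192, 8192, 4128]
```

The window size is a parameter, so the same helper applies if a deployment exposes a different context limit.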

Good For

  • Applications requiring strong language model capabilities in African languages.
  • Research and development focused on low-resource language NLP.
  • Tasks that need multilingual coverage at a small footprint, where a 4B-parameter model is preferable to a larger one.
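
For a rough sense of that footprint, weight memory can be estimated from the listing metadata (4.3B parameters stored in BF16, which uses 2 bytes per parameter). This is a back-of-the-envelope sketch: it ignores activations, the KV cache, and framework overhead, and weight_memory_gb is a hypothetical helper.

```python
# Bytes needed to store one parameter in each dtype.
BYTES_PER_PARAM = {"bf16": 2, "fp32": 4}


def weight_memory_gb(n_params, dtype="bf16"):
    """Return the memory needed for the weights alone, in gigabytes (10^9 bytes)."""
    return n_params * BYTES_PER_PARAM[dtype] / 1e9


bf16_gb = weight_memory_gb(4.3e9)          # size and dtype from the listing
fp32_gb = weight_memory_gb(4.3e9, "fp32")  # for comparison only
print(f"BF16: {bf16_gb:.1f} GB, FP32: {fp32_gb:.1f} GB")  # BF16: 8.6 GB, FP32: 17.2 GB
```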