McGill-NLP/AfriqueGemma-12B

Vision | Concurrency Cost: 1 | Model Size: 12B | Quant: FP8 | Ctx Length: 32k | Published: Jan 6, 2026 | License: cc-by-4.0 | Architecture: Transformer | Open Weights

McGill-NLP/AfriqueGemma-12B is a 12 billion parameter causal language model from the AfriqueLLM suite, based on Google's Gemma 3 12B PT. It was adapted through continued pre-training on approximately 26 billion tokens to improve performance across 20 African languages while retaining strong capabilities in high-resource languages. It has an 8,192-token context length and is intended for multilingual applications, particularly African language understanding and generation.


Model Overview

McGill-NLP/AfriqueGemma-12B is a 12 billion parameter causal language model, part of the AfriqueLLM suite developed by McGill-NLP. It is built upon Google's Gemma 3 12B PT and has undergone continued pre-training (CPT) on approximately 26 billion tokens of carefully curated multilingual data. This adaptation specifically targets improved performance across 20 African languages, alongside maintaining proficiency in high-resource languages like English, French, Portuguese, and Arabic.

Key Capabilities

  • Multilingual Proficiency: Adapted for 20 African languages including Swahili, Amharic, Hausa, Yoruba, and Zulu, among others.
  • Base Model: Built on Google's Gemma 3 12B PT, the pre-trained (not instruction-tuned) checkpoint.
  • Extended Context: Features a native context length of 8,192 tokens.
  • Specialized Training Data: Trained on a diverse corpus including African monolingual data (22.8B tokens), code (1B tokens), mathematics (1B tokens), and synthetic data (324M tokens) to enhance reasoning and mathematical understanding.
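The token counts in the training-data bullet can be turned into rough proportions. The sketch below uses only the figures listed in this card; the dictionary keys and variable names are illustrative. Note that the listed components sum to about 25.1B tokens, slightly under the "approximately 26 billion" total quoted elsewhere in the card, and the card does not itemize the difference.

```python
# Token counts as listed in the training-data bullet (in billions of tokens).
corpus_tokens_b = {
    "African monolingual": 22.8,
    "code": 1.0,
    "mathematics": 1.0,
    "synthetic": 0.324,
}

# Sum of the listed components only; the card's headline figure is ~26B.
listed_total_b = sum(corpus_tokens_b.values())

# Share of each listed component, as a percentage of the listed total.
shares_pct = {
    name: round(100 * tokens / listed_total_b, 1)
    for name, tokens in corpus_tokens_b.items()
}

print(f"listed total: {listed_total_b:.3f}B tokens")
for name, pct in shares_pct.items():
    print(f"{name:>20}: {pct}%")
```

On these numbers, African monolingual text makes up roughly nine tenths of the listed mix, with code, mathematics, and synthetic data together contributing the remainder.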

Performance Highlights

AfriqueGemma-12B improves on its base model, Gemma 3 12B, across African language benchmarks. Its overall score is 58.82, a gain of +4.0 points (7.3%) over the base model. On FLORES (eng->xxx), which measures translation from English into the target languages, it scores 65.04 compared to 44.09 for Gemma 3 12B, indicating better translation into African languages.
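As a quick check on how the absolute and relative gains relate, the arithmetic can be reproduced from the reported scores. This is a small sketch, not part of any evaluation code; the helper name `gain` is hypothetical, and the base model's overall score is derived here from the reported +4.0 (a rounded figure) rather than quoted from the card.

```python
def gain(new: float, old: float) -> tuple[float, float]:
    """Return (absolute gain in points, relative gain in percent)."""
    absolute = new - old
    return absolute, 100 * absolute / old


# Overall score: the card reports 58.82 and a +4.0 point gain.
overall_new = 58.82
overall_base = overall_new - 4.0  # derived from the rounded +4.0, not quoted
overall_abs, overall_rel = gain(overall_new, overall_base)

# FLORES (eng->xxx): both scores are quoted in the card.
flores_abs, flores_rel = gain(65.04, 44.09)

print(f"overall: +{overall_abs:.2f} points ({overall_rel:.1f}%)")
print(f"FLORES eng->xxx: +{flores_abs:.2f} points ({flores_rel:.1f}%)")
```

The overall relative gain comes out at about 7.3%, matching the reported figure, while the FLORES gain of about 21 points is a much larger relative change.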

Use Cases

This model is suited to applications that need language understanding and generation across a wide range of African languages, and to tasks that benefit from its general reasoning and mathematical capabilities. It is aimed at developers building multilingual AI applications for African languages.
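For developers who want to try the model locally, a minimal sketch with the Hugging Face `transformers` library might look like the following. Several things here are assumptions rather than details from this card: that the checkpoint loads through `AutoModelForCausalLM`, that `torch` and `accelerate` are installed, and that a plain completion-style prompt is appropriate for a pre-trained (non-chat) model. The prompt format and the helper names `build_translation_prompt` and `generate` are hypothetical.

```python
MODEL_ID = "McGill-NLP/AfriqueGemma-12B"
MAX_CONTEXT_TOKENS = 8192  # native context length stated in this card


def build_translation_prompt(text: str, source_lang: str, target_lang: str) -> str:
    """Build a completion-style prompt (illustrative format, an assumption)."""
    return f"{source_lang}: {text}\n{target_lang}:"


def generate(prompt: str, max_new_tokens: int = 64) -> str:
    """Load the checkpoint and complete `prompt`.

    Calling this downloads the full 12B-parameter weights.
    """
    # Imported lazily so the prompt helper works without these packages.
    import torch
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
    # Assumption: the checkpoint is loadable as a causal LM in bfloat16.
    model = AutoModelForCausalLM.from_pretrained(
        MODEL_ID, torch_dtype=torch.bfloat16, device_map="auto"
    )

    inputs = tokenizer(prompt, return_tensors="pt")
    prompt_len = inputs["input_ids"].shape[1]
    # Refuse prompts that would not fit in the stated context length.
    if prompt_len + max_new_tokens > MAX_CONTEXT_TOKENS:
        raise ValueError("prompt plus generation exceeds the context length")

    inputs = inputs.to(model.device)
    output_ids = model.generate(**inputs, max_new_tokens=max_new_tokens)
    # Decode only the newly generated tokens, not the echoed prompt.
    new_tokens = output_ids[0, prompt_len:]
    return tokenizer.decode(new_tokens, skip_special_tokens=True)


# Example call (commented out because it downloads the model):
# print(generate(build_translation_prompt("Good morning", "English", "Swahili")))
```

Because the model is a continued pre-training of a base checkpoint, few-shot prompts with a handful of example pairs before the final line may work better than the single-line prompt shown here.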