McGill-NLP/AfriqueGemma-12B
Model Overview
McGill-NLP/AfriqueGemma-12B is a 12 billion parameter causal language model from the AfriqueLLM suite, developed by McGill-NLP. It builds upon Google's Gemma 3 12B PT and has undergone continued pre-training (CPT) on approximately 26 billion tokens of carefully curated multilingual data. This adaptation significantly enhances its performance across 20 African languages while maintaining strong capabilities in high-resource languages such as English, French, Portuguese, and Arabic.
Key Capabilities
- Multilingual Proficiency: Adapted for 20 African languages (e.g., Swahili, Hausa, Yoruba, Amharic) and 4 high-resource languages.
- Robust Base: Utilizes the Gemma 3 12B PT architecture with an 8,192-token native context length.
- Enhanced Reasoning: Training data includes ~1B tokens from CornStack-Python for code reasoning and ~1B tokens from FineMath-4+ for mathematical understanding.
- Performance Improvement: Demonstrates a +4.0 point (7.3% relative) overall improvement on a suite of African multilingual benchmarks (AfriMGSM, AfriMMLU, AfriXNLI, Belebele, FLORES, INJONG, SIB-200) compared to its base model, Gemma 3 12B PT.
Good for
- Applications requiring strong language understanding and generation in a wide array of African languages.
- Tasks benefiting from improved reasoning and mathematical capabilities in a multilingual context.
- Developers looking for a robust base model specifically optimized for African linguistic diversity.
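As a base (pre-trained, not instruction-tuned) model, AfriqueGemma-12B can be loaded with the Hugging Face `transformers` library like any other Gemma 3 checkpoint. The sketch below is illustrative: the prompt text and generation settings are assumptions, not recommendations from the model card.

```python
# Minimal usage sketch for AfriqueGemma-12B with Hugging Face transformers.
# The Swahili prompt and max_new_tokens value are illustrative assumptions.
from transformers import AutoModelForCausalLM, AutoTokenizer

MODEL_ID = "McGill-NLP/AfriqueGemma-12B"

def build_prompt(text: str) -> str:
    # Base models are prompted with plain text continuations,
    # not a chat template.
    return text

if __name__ == "__main__":
    tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
    model = AutoModelForCausalLM.from_pretrained(MODEL_ID, device_map="auto")

    prompt = build_prompt("Habari ya leo ni")  # Swahili continuation prompt
    inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
    outputs = model.generate(**inputs, max_new_tokens=50)
    print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```

Because this is a base model, downstream applications will typically either use it for continuation-style prompting as above or apply their own instruction tuning on top.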