AfriqueGemma-12B: Multilingual Adaptation for African Languages
AfriqueGemma-12B is a 12 billion parameter causal language model developed by McGill-NLP as part of the AfriqueLLM suite. It is built on Google's Gemma 3 12B PT and has undergone extensive continued pre-training (CPT) on 25.2 billion tokens curated to improve performance across 20 African languages while maintaining proficiency in high-resource languages such as English, French, Portuguese, and Arabic.
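The snippet below is a minimal generation sketch using Hugging Face transformers. The repository id "McGill-NLP/AfriqueGemma-12B" and the bfloat16/device-map settings are assumptions for illustration; check the model card header for the exact identifier and adjust to your hardware.

```python
# Minimal text-generation sketch with Hugging Face transformers.
# NOTE: the repo id below is assumed; verify it against the model card.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "McGill-NLP/AfriqueGemma-12B"  # assumed repository id

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,  # bf16 keeps the 12B model within a single 80 GB GPU
    device_map="auto",
)

# Swahili prompt; this is a pretrained (not instruction-tuned) checkpoint,
# so plain continuation-style prompts work best.
prompt = "Habari ya leo ni kwamba"
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=64, do_sample=False)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```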
Key Capabilities
- Multilingual Proficiency: Adapted for 20 African languages including Swahili, Hausa, Yoruba, Zulu, and Amharic, while mitigating catastrophic forgetting of high-resource languages.
- Robust Training: Trained on a diverse corpus including 22.8B tokens of African monolingual data, 1B tokens of code (CornStack-Python), 1B tokens of mathematics (FineMath-4+), and 324M tokens of GPT-4.1 translated synthetic data.
- Performance Improvement: Demonstrates significant gains over its base model on multilingual benchmarks, with a +4.8-point (8.9%) overall improvement on the AfriqueLLM evaluation suite.
- Efficient Inference: Supports deployment with vLLM and sglang for optimized serving (see the sketch after this list).
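As one way to serve the model, here is a short offline-batching sketch with vLLM's Python API. The repository id is again an assumption, and the sampling parameters and example prompts (Hausa and Yoruba continuations) are illustrative only.

```python
# Offline batched generation with vLLM (sketch; repo id assumed).
from vllm import LLM, SamplingParams

llm = LLM(model="McGill-NLP/AfriqueGemma-12B", dtype="bfloat16")
params = SamplingParams(temperature=0.7, top_p=0.9, max_tokens=128)

prompts = [
    "Ina labarin yau a Najeriya:",      # Hausa continuation prompt
    "Àwọn ìròyìn pàtàkì lónìí ni pé",   # Yoruba continuation prompt
]
for output in llm.generate(prompts, params):
    print(output.outputs[0].text)
```

For an OpenAI-compatible HTTP endpoint, the same model id can be passed to `vllm serve`; sglang offers an analogous server mode.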
Good for
- Applications requiring strong language understanding and generation in a wide array of African languages.
- Research and development in low-resource language NLP.
- Tasks benefiting from a model with enhanced reasoning and mathematical capabilities due to specialized training data.