McGill-NLP/AfriqueGemma-4B
AfriqueGemma-4B is a 4 billion parameter causal language model developed by McGill-NLP, adapted from Google's Gemma 3 4B PT. It targets 20 African languages through continued pre-training on approximately 26 billion tokens of multilingual data. The model retains strong capabilities in high-resource languages while improving understanding and generation in African languages, making it suitable for multilingual applications targeting the African continent.
Model Overview
AfriqueGemma-4B, part of the AfriqueLLM suite by McGill-NLP, is a 4 billion parameter causal language model based on google/gemma-3-4b-pt. It has been specifically adapted for 20 African languages through continued pre-training on approximately 26 billion tokens of curated multilingual data, while also retaining strong performance in high-resource languages like English and French.
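As a minimal sketch of how a checkpoint like this is typically used, the snippet below builds a few-shot prompt and completes it with the Hugging Face transformers library. It assumes that the repository ID McGill-NLP/AfriqueGemma-4B loads through AutoModelForCausalLM and that transformers and torch are installed. The helper names build_few_shot_prompt and complete, and the example sentences, are illustrative and not taken from the model card. Because the model is a continued pre-training (base) checkpoint rather than an instruction-tuned one, the sketch uses a plain-text few-shot prompt instead of a chat template.

```python
MODEL_ID = "McGill-NLP/AfriqueGemma-4B"  # repository ID from the model card


def build_few_shot_prompt(examples, query, src="English", tgt="Swahili"):
    """Format (source, target) pairs plus a new query as a completion prompt."""
    blocks = [f"{src}: {s}\n{tgt}: {t}" for s, t in examples]
    blocks.append(f"{src}: {query}\n{tgt}:")
    return "\n\n".join(blocks)


def complete(prompt, max_new_tokens=32):
    """Greedy completion. Assumes the checkpoint loads via AutoModelForCausalLM."""
    # Imported lazily so the prompt helper works without the heavy dependencies.
    import torch
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
    model = AutoModelForCausalLM.from_pretrained(MODEL_ID, torch_dtype=torch.bfloat16)
    inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
    with torch.no_grad():
        output = model.generate(**inputs, max_new_tokens=max_new_tokens, do_sample=False)
    # Decode only the newly generated tokens, not the echoed prompt.
    new_tokens = output[0][inputs["input_ids"].shape[1]:]
    return tokenizer.decode(new_tokens, skip_special_tokens=True)


# Illustrative example pair; any (source, target) sentences can be used.
prompt = build_few_shot_prompt([("Good morning.", "Habari ya asubuhi.")], "Thank you.")
# print(complete(prompt))  # uncomment to download the weights and generate
```

Ending the prompt on the bare target-language label leaves the model to continue with the translation, which is the usual way to steer a base model without instruction tuning.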
Key Capabilities
- Multilingual Proficiency: Enhanced performance across 20 African languages (e.g., Swahili, Hausa, Yoruba, Amharic) due to targeted continued pre-training.
- Robust Base: Built on google/gemma-3-4b-pt, the pre-trained (PT) Gemma 3 4B checkpoint, rather than an instruction-tuned variant.
- Context Length: Native context length of 8,192 tokens.
- Specialized Training Data: The training corpus includes African monolingual data (FineWeb2, WURA, MADLAD-400), code (CornStack-Python), mathematics (FineMath-4+), and synthetic data translated with GPT-4.1.
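The 8,192-token context length listed above bounds the prompt and the generated tokens together. The following is a rough sketch of one way to keep a long prompt inside that window by dropping the oldest tokens. The function name fit_to_context and the left-truncation strategy are assumptions made for illustration, not something the model card prescribes.

```python
CONTEXT_LENGTH = 8192  # native context length stated in the model card


def fit_to_context(token_ids, max_new_tokens, context_length=CONTEXT_LENGTH):
    """Keep the most recent tokens so prompt + generation fits the window."""
    budget = context_length - max_new_tokens
    if budget <= 0:
        raise ValueError("max_new_tokens leaves no room for a prompt")
    # Negative slicing returns the whole list when it is already short enough.
    return token_ids[-budget:]


ids = list(range(10_000))  # stand-in for tokenizer output
trimmed = fit_to_context(ids, max_new_tokens=192)
print(len(trimmed))  # 8000
```

Left truncation also discards any special token at the start of the sequence, so a real pipeline may need to re-prepend it after trimming.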
Good For
- Applications requiring strong language model capabilities in African languages.
- Research and development focused on low-resource language NLP.
- Tasks that call for a compact 4B parameter model adapted for multilingual use.