McGill-NLP/AfriqueQwen-8B

Text Generation | Concurrency Cost: 1 | Model Size: 8B | Quant: FP8 | Ctx Length: 32k | Tool Calling: Supported | Published: Jan 7, 2026 | License: cc-by-4.0 | Architecture: Transformer | Open Weights | Warm

McGill-NLP/AfriqueQwen-8B is an 8-billion-parameter causal language model from the AfriqueLLM suite, developed by McGill-NLP. It is based on Qwen3-8B-Base and underwent continued pre-training on approximately 26 billion tokens to adapt it to 20 African languages while maintaining strong performance in high-resource languages. The model has a 32,768-token context length and targets multilingual tasks, particularly African language understanding and generation.


Model Overview

McGill-NLP/AfriqueQwen-8B is an 8-billion-parameter causal language model developed by McGill-NLP and built on Qwen3-8B-Base. It is part of the AfriqueLLM suite, which focuses on adapting large language models to African languages. The model underwent continued pre-training (CPT) on approximately 26 billion tokens of curated multilingual data, including African monolingual data, code, mathematics, and synthetic data.

Key Capabilities

  • Multilingual Adaptation: Adapted to 20 African languages (e.g., Afrikaans, Swahili, Amharic, Hausa, Yoruba, Zulu) through continued pre-training, with the aim of improving performance on these low-resource languages.
  • Base Model Performance Preservation: Maintains strong capabilities in high-resource languages (English, French, Portuguese, Arabic) thanks to the Qwen3 base and strategies that mitigate catastrophic forgetting.
  • Long Context Handling: Features a native context length of 32,768 tokens, making it suitable for tasks requiring extensive contextual understanding, such as document-level translation.
  • Robust Training: Trained on a diverse corpus that includes FineWeb2, WURA, and MADLAD-400 for African languages, CornStack-Python for code, and FineMath-4+ for mathematical reasoning.
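
As a rough illustration of how these capabilities might be exercised, the sketch below loads the checkpoint with the Hugging Face `transformers` library and completes a plain-text few-shot translation prompt. It assumes the model behaves like a base (non-chat) model, since it is described as continued pre-training on Qwen3-8B-Base. The helper names `build_few_shot_prompt` and `generate`, the prompt layout, and the Swahili examples are illustrative choices, not something the model card prescribes. `device_map="auto"` additionally assumes the `accelerate` package is installed.

```python
MODEL_ID = "McGill-NLP/AfriqueQwen-8B"


def build_few_shot_prompt(examples, source_text, src_lang="English", tgt_lang="Swahili"):
    """Format a plain-text few-shot translation prompt (hypothetical layout)."""
    blocks = [f"{src_lang}: {src}\n{tgt_lang}: {tgt}" for src, tgt in examples]
    # Leave the final target line open so the model completes it.
    blocks.append(f"{src_lang}: {source_text}\n{tgt_lang}:")
    return "\n\n".join(blocks)


def generate(prompt, max_new_tokens=64):
    """Load the model and complete the prompt. Downloads the full weights."""
    # Imported lazily so the prompt helper works without these packages.
    import torch
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
    model = AutoModelForCausalLM.from_pretrained(
        MODEL_ID, torch_dtype=torch.bfloat16, device_map="auto"
    )
    inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
    output_ids = model.generate(**inputs, max_new_tokens=max_new_tokens)
    # Drop the prompt tokens and decode only the continuation.
    new_tokens = output_ids[0][inputs["input_ids"].shape[1]:]
    return tokenizer.decode(new_tokens, skip_special_tokens=True)


if __name__ == "__main__":
    prompt = build_few_shot_prompt([("Thank you.", "Asante.")], "Good morning.")
    print(prompt)
    # Uncomment on a machine with enough memory for an 8B model:
    # print(generate(prompt))
```

Keeping the prompt builder separate from the model call makes it possible to inspect or unit-test prompts without loading the weights.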

Good For

  • African Language Applications: Ideal for developers building applications targeting the 20 supported African languages, including text generation, translation, and understanding.
  • Multilingual Research: Useful for researchers studying language model adaptation, continued pre-training, and performance in low-resource linguistic contexts.
  • Long-form Content Processing: Its large context window makes it suitable for tasks involving lengthy documents or conversations in supported languages.
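
For long-form inputs, one possible approach is to pack paragraphs into chunks that stay within the 32,768-token context length while leaving room for generated output. The sketch below is a minimal version of that idea. The function name `chunk_by_token_budget`, the `reserve_for_output` default, and the whitespace-based counter in the usage example are assumptions made for illustration; in practice the counter would come from the model's own tokenizer, and separator tokens between paragraphs are ignored here.

```python
CONTEXT_LENGTH = 32_768  # context length stated for this model


def chunk_by_token_budget(paragraphs, count_tokens,
                          reserve_for_output=1024,
                          context_length=CONTEXT_LENGTH):
    """Greedily group paragraphs so each group fits the prompt budget.

    count_tokens is any callable mapping a string to a token count.
    """
    budget = context_length - reserve_for_output
    if budget <= 0:
        raise ValueError("reserve_for_output must be smaller than context_length")

    chunks, current, used = [], [], 0
    for para in paragraphs:
        n = count_tokens(para)
        if n > budget:
            raise ValueError("a single paragraph exceeds the token budget")
        # Start a new chunk when adding this paragraph would overflow.
        if current and used + n > budget:
            chunks.append(current)
            current, used = [], 0
        current.append(para)
        used += n
    if current:
        chunks.append(current)
    return chunks


if __name__ == "__main__":
    # Whitespace splitting is only a stand-in for a real tokenizer.
    approx = lambda text: len(text.split())
    docs = ["a b c", "d e", "f g h i"]
    print(chunk_by_token_budget(docs, approx, reserve_for_output=1, context_length=6))
```

With a real tokenizer, `count_tokens` could be something like `lambda t: len(tokenizer(t)["input_ids"])`, which would make the budget reflect actual token counts for the supported languages.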

Popular Sampler Settings

Top 3 parameter combinations used by Featherless users for this model. Parameters covered: temperature, top_p, top_k, frequency_penalty, presence_penalty, repetition_penalty, min_p.
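
To show how these sampler parameters might fit together, the sketch below assembles a completion request body as a plain dictionary. It assumes an OpenAI-style completions payload that also accepts `top_k`, `repetition_penalty`, and `min_p`; whether a given endpoint honors every field is not confirmed here. The numeric values are placeholders, not the popular combinations this section refers to, and `build_completion_request` is a hypothetical helper.

```python
import json

# The seven sampler parameters listed in this section.
SAMPLER_KEYS = (
    "temperature", "top_p", "top_k", "frequency_penalty",
    "presence_penalty", "repetition_penalty", "min_p",
)


def build_completion_request(prompt, sampler,
                             model="McGill-NLP/AfriqueQwen-8B",
                             max_tokens=128):
    """Validate sampler settings and merge them into a request body."""
    unknown = set(sampler) - set(SAMPLER_KEYS)
    if unknown:
        raise ValueError(f"unknown sampler parameters: {sorted(unknown)}")
    if sampler.get("temperature", 1.0) < 0:
        raise ValueError("temperature must be non-negative")
    if not 0.0 < sampler.get("top_p", 1.0) <= 1.0:
        raise ValueError("top_p must be in (0, 1]")
    if not 0.0 <= sampler.get("min_p", 0.0) <= 1.0:
        raise ValueError("min_p must be in [0, 1]")
    return {"model": model, "prompt": prompt, "max_tokens": max_tokens, **sampler}


if __name__ == "__main__":
    # Placeholder values chosen only for illustration.
    example_sampler = {
        "temperature": 0.7, "top_p": 0.9, "top_k": 40,
        "frequency_penalty": 0.0, "presence_penalty": 0.0,
        "repetition_penalty": 1.1, "min_p": 0.05,
    }
    body = build_completion_request("Habari ya asubuhi.", example_sampler)
    print(json.dumps(body, indent=2))
```

Validating the settings before sending a request keeps obvious mistakes, such as a misspelled parameter name, from being silently ignored by the server.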