AfriqueQwen-14B-multiturn, developed by McGill-NLP, is a 14-billion-parameter language model fine-tuned from McGill-NLP/AfriqueQwen-14B. It is optimized for multi-turn conversational tasks using the afri_multiturn dataset and supports a 32,768-token context length, making it suited to applications that require nuanced, extended dialogue in African languages.
AfriqueQwen-14B-multiturn Overview
This model is a specialized 14-billion-parameter language model, fine-tuned by McGill-NLP from its base model, McGill-NLP/AfriqueQwen-14B. Its 32,768-token context window makes it well suited to long conversational exchanges.
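A minimal loading sketch is shown below; it assumes the checkpoint exposes the standard Hugging Face AutoModelForCausalLM and AutoTokenizer interfaces, which is typical for Qwen-derived models but not confirmed by this card:

```python
# Minimal loading sketch (assumes a standard Hugging Face causal-LM layout).
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "McGill-NLP/AfriqueQwen-14B-multiturn"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype="auto",  # use the checkpoint's native precision
    device_map="auto",   # shard the 14B parameters across available GPUs
)
```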
Key Capabilities
- Multi-turn Dialogue: Specifically optimized for extended, multi-turn conversations (see the usage sketch after this list).
- African Language Focus: Fine-tuned on the afri_multiturn dataset, indicating a focus on conversational AI for African languages.
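Multi-turn prompting is usually driven through the tokenizer's chat template. Continuing from the loading sketch above, the example below assumes such a template is bundled with the tokenizer (standard for Qwen-derived instruction models); the message contents are placeholders:

```python
# Hypothetical multi-turn exchange; message contents are placeholders.
messages = [
    {"role": "user", "content": "First user turn."},
    {"role": "assistant", "content": "Model's earlier reply."},
    {"role": "user", "content": "Follow-up question."},
]
inputs = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)
outputs = model.generate(inputs, max_new_tokens=256)
# Decode only the newly generated tokens, skipping the prompt.
print(tokenizer.decode(outputs[0][inputs.shape[-1]:], skip_special_tokens=True))
```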
Training Details
The model was trained with a learning rate of 1e-05 under a cosine schedule with a warmup ratio of 0.1, for 5 epochs. Optimization used AdamW with its configured beta and epsilon values, distributed across 4 GPUs for a total batch size of 8. The training environment included Transformers 5.2.0 and PyTorch 2.10.0+cu128.
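These hyperparameters map onto the transformers Trainer API roughly as follows. This is an illustrative reconstruction, not the published training script: the per-device batch size of 2 is inferred from 4 GPUs and a total batch size of 8 (assuming no gradient accumulation), and the AdamW betas/epsilon are left at their defaults here:

```python
# Illustrative reconstruction of the reported training configuration.
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="afriqueqwen-14b-multiturn",  # hypothetical output path
    learning_rate=1e-5,
    lr_scheduler_type="cosine",
    warmup_ratio=0.1,
    num_train_epochs=5,
    optim="adamw_torch",            # AdamW; betas/epsilon left at defaults
    per_device_train_batch_size=2,  # inferred: 4 GPUs x 2 = total batch of 8
)
```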
Good For
- Developing conversational AI agents for African language contexts.
- Applications requiring models capable of understanding and generating long, multi-turn dialogues.
- Research into large language models adapted for specific linguistic and cultural datasets.