Model Overview

AfriqueQwen-14B-multiturn_2 is a 14 billion parameter language model, fine-tuned from the base model McGill-NLP/AfriqueQwen-14B. This iteration has been specifically adapted for multi-turn conversational applications through fine-tuning on the afri_multiturn_2 dataset.

Key Characteristics

Base Model: Derived from McGill-NLP/AfriqueQwen-14B.
Parameter Count: 14 billion parameters, offering a balance between performance and computational requirements.
Context Length: Supports a substantial context window of 32,768 tokens, enabling the model to handle longer and more complex multi-turn dialogues.
Fine-tuning Focus: Optimized for multi-turn interactions, indicating enhanced coherence and context retention across conversational exchanges.

Training Details

The model was trained with a learning rate of 1e-05 over 5 epochs, utilizing a distributed setup across 4 GPUs. Key hyperparameters included a total batch size of 8 (with gradient accumulation steps of 2) and the AdamW_TORCH_FUSED optimizer. The training process employed a cosine learning rate scheduler with a 0.1 warmup ratio.

Potential Use Cases

Multi-turn Dialogue Systems: Ideal for chatbots, virtual assistants, and conversational AI applications requiring sustained context.
African Language Processing: Given its fine-tuning on the afri_multiturn_2 dataset, it may offer specialized capabilities for African language-specific conversational tasks, though further details on the dataset's linguistic scope are needed.

Limitations

The model card indicates that more information is needed regarding its specific intended uses, limitations, and the detailed nature of the training and evaluation data. Users should exercise caution and conduct thorough evaluations for specific applications.

Overview

Model Overview

Key Characteristics

Training Details

Potential Use Cases

Limitations

Full Model Card (README)