israel/AfriqueQwen-14B-multiturn_1
AfriqueQwen-14B-multiturn_1 is a 14 billion parameter language model fine-tuned by israel from McGill-NLP's AfriqueQwen-14B. The model is optimized for multi-turn conversational tasks and was trained on the afri_multiturn_1 dataset. With a context length of 32768 tokens, it aims to enhance dialogue capabilities in African languages.
Model Overview
AfriqueQwen-14B-multiturn_1 is a 14 billion parameter language model developed by israel, building on the McGill-NLP/AfriqueQwen-14B base model. It has been fine-tuned on the afri_multiturn_1 dataset to specialize in multi-turn conversational AI, particularly for African languages.
Key Capabilities
- Multi-turn Dialogue: Optimized for engaging in extended, coherent conversations.
- African Language Focus: Derived from a base model designed for African linguistic contexts.
- Large Context Window: Supports a substantial context length of 32768 tokens, allowing for detailed and long-form interactions.
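To make the context budget concrete, here is a minimal, illustrative sketch (not part of the model's API) of keeping a multi-turn history within a fixed token budget by dropping the oldest turns first. Token counts are approximated by whitespace word count; a real application would count tokens with the model's tokenizer instead.

```python
# Hypothetical helper: trim a chat history to fit the model's context window.
# Word count stands in for a real tokenizer here, purely for illustration.

MAX_CONTEXT_TOKENS = 32768  # context length reported for this model


def approx_tokens(text: str) -> int:
    """Rough token estimate; a stand-in for the model's tokenizer."""
    return len(text.split())


def trim_history(history: list, budget: int = MAX_CONTEXT_TOKENS) -> list:
    """Drop the oldest messages until the history fits within the token budget."""
    trimmed = list(history)
    while trimmed and sum(approx_tokens(m["content"]) for m in trimmed) > budget:
        trimmed.pop(0)  # discard the oldest message first
    return trimmed
```

In practice, systems often pin the system prompt and trim only the oldest user/assistant turns, but the sliding-window idea above is the same.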
Training Details
The model was trained with a learning rate of 1e-05 and a total batch size of 8 across 4 GPUs, for 5 epochs, using a cosine learning rate scheduler and the AdamW optimizer. This fine-tuning adapts the base model's general language understanding to the multi-turn dialogue patterns found in the afri_multiturn_1 dataset.
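The cosine schedule mentioned above can be sketched as follows. This is a simplified illustration assuming no warmup and decay to zero, neither of which the card states; actual trainer implementations may differ.

```python
import math

BASE_LR = 1e-05  # learning rate from the training details above


def cosine_lr(step: int, total_steps: int, base_lr: float = BASE_LR) -> float:
    """Cosine decay from base_lr at step 0 down to 0 at total_steps."""
    return 0.5 * base_lr * (1.0 + math.cos(math.pi * step / total_steps))
```

The schedule starts at the full learning rate, passes through half the rate at the midpoint, and approaches zero by the final step.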
Intended Use Cases
This model is suitable for applications requiring robust multi-turn conversational abilities, especially in environments where African language support is beneficial. Potential uses include chatbots, virtual assistants, and interactive dialogue systems.
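For chatbot-style use, a multi-turn conversation is typically assembled as a list of role/content messages. The sketch below is illustrative (the helper name and system prompt are hypothetical); in practice such a list would be passed to the model's tokenizer via `apply_chat_template` in Hugging Face transformers.

```python
# Hypothetical helper: build a chat-format message list from prior turns.


def build_conversation(system_prompt, turns, new_user_message):
    """Assemble messages: system prompt, prior (user, assistant) turns, new user turn."""
    messages = [{"role": "system", "content": system_prompt}]
    for user_msg, assistant_msg in turns:
        messages.append({"role": "user", "content": user_msg})
        messages.append({"role": "assistant", "content": assistant_msg})
    messages.append({"role": "user", "content": new_user_message})
    return messages
```

Keeping the full turn history in the message list, subject to the 32768-token context limit, is what lets the model maintain coherence across turns.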