Model Overview
The nazimali/Mistral-Nemo-Kurdish-Instruct is a 12-billion-parameter language model, fine-tuned from the nazimali/Mistral-Nemo-Kurdish base model. It is built to follow instructions and generate responses in Kurdish (Kurmanji). The model was trained on a Kurdish Kurmanji instruction dataset of 41,559 filtered rows from saillab/alpaca-kurdish_kurmanji-cleaned.
Key Capabilities
- Kurdish (Kurmanji) Instruction Following: Specialized in understanding and generating text based on instructions provided in the Kurmanji dialect.
- Large Context Window: Supports a context length of 32768 tokens, allowing for processing longer inputs and generating more extensive responses.
- Accessibility: Provides examples for integration with llama-cpp-python, llama.cpp, and Hugging Face Transformers, making it accessible for various development environments.
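A minimal Transformers loading sketch is shown below. The model ID comes from the source; everything else (function name, generation parameters) is illustrative, and downloading the 12B weights requires substantial disk space and a GPU, so the heavy work is deferred inside a function rather than run at import time.

```python
MODEL_ID = "nazimali/Mistral-Nemo-Kurdish-Instruct"

def generate(prompt: str, max_new_tokens: int = 128) -> str:
    """Load the model and return a completion for `prompt`.

    The import is deferred because transformers and the model weights
    are heavy dependencies; calling this function triggers the download.
    """
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
    model = AutoModelForCausalLM.from_pretrained(MODEL_ID, device_map="auto")

    inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
    outputs = model.generate(**inputs, max_new_tokens=max_new_tokens)
    return tokenizer.decode(outputs[0], skip_special_tokens=True)
```

The same model is also distributed in GGUF form for llama.cpp / llama-cpp-python; see the model card for those invocations.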
Training Details
The model was fine-tuned using Transformers 4.44.2 on a single NVIDIA A40 GPU, with a training run of approximately 7 hours and 41 minutes. Training reached a final loss of 0.7774 and consumed a total of 2.225e18 FLOPs. A fixed instruction format was used during fine-tuning, so prompts should follow the same structure at inference time.
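As a sketch of what "structured prompts" means in practice, the helper below wraps an instruction in Mistral-style `[INST]` tags. The exact template used during fine-tuning is not reproduced in this section, so this format is an assumption; the model card's chat template is authoritative.

```python
def format_instruction(instruction: str, input_text: str = "") -> str:
    """Build a Mistral-style instruction prompt.

    NOTE: the [INST] ... [/INST] wrapping is an assumed format based on
    the Mistral family convention, not confirmed by this section.
    """
    body = instruction if not input_text else f"{instruction}\n{input_text}"
    return f"[INST] {body} [/INST]"
```

In practice, prefer `tokenizer.apply_chat_template(...)` where available, since it reads the template shipped with the model.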
Future Development
The developer intends to explore multi-GPU training setups to reduce training times and plans to expand the model's capabilities to include both Kurdish Kurmanji (Latin script) and Kurdish Sorani (Arabic script) in future iterations.