nazimali/Mistral-Nemo-Kurdish-Instruct

12B parameters · FP8 · 32768 context length · License: apache-2.0

Model Overview

nazimali/Mistral-Nemo-Kurdish-Instruct is a 12-billion-parameter language model fine-tuned from the nazimali/Mistral-Nemo-Kurdish base model. Its primary focus is generating instruction-following responses in Kurdish (Kurmanji). The model was trained on a single Kurdish Kurmanji instruction dataset comprising 41,559 filtered rows from saillab/alpaca-kurdish_kurmanji-cleaned.

Key Capabilities

  • Kurdish (Kurmanji) Instruction Following: Specialized in understanding and generating text based on instructions provided in the Kurmanji dialect.
  • Large Context Window: Supports a context length of 32768 tokens, allowing it to process longer inputs and generate more extensive responses.
  • Accessibility: Provides examples for integration with llama-cpp-python, llama.cpp, and Hugging Face Transformers, making it accessible for various development environments.
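
The integration options above can be sketched as follows, here using llama-cpp-python. This is a minimal sketch, not taken from the model card: the GGUF file path and the exact chat-completion settings are assumptions, so check the repository for the actual quantized file names.

```python
# Minimal usage sketch with llama-cpp-python, one of the integrations the
# model card lists for this model.
def build_messages(instruction: str) -> list:
    """Wrap a Kurmanji instruction in the chat-message shape llama-cpp expects."""
    return [{"role": "user", "content": instruction}]

def generate(instruction: str, model_path: str) -> str:
    # Imported lazily so build_messages stays usable without llama-cpp-python
    # installed; model_path should point at a local GGUF file (an assumption,
    # see the repo for actual file names).
    from llama_cpp import Llama

    llm = Llama(model_path=model_path, n_ctx=32768)  # full 32768-token context
    out = llm.create_chat_completion(messages=build_messages(instruction))
    return out["choices"][0]["message"]["content"]
```

The same messages list can be reused with the Transformers chat-template API, which is why prompt construction is kept separate from model loading here.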

Training Details

The model was fine-tuned using Transformers 4.44.2 on a single NVIDIA A40 GPU, with a training duration of approximately 7 hours and 41 minutes. Training reached a final loss of 0.7774 and a total of 2.225e+18 FLOPs. The instruction format used for fine-tuning ensures the model can effectively process and respond to structured prompts.
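
For illustration only, a prompt builder in the generic Alpaca style associated with the source dataset's family; the exact template used during fine-tuning is defined on the model page and may differ, so treat this layout as an assumption.

```python
# Illustrative Alpaca-style prompt builder. The actual fine-tuning template
# is specified on the model page; this generic form is an assumption.
def format_instruction(instruction: str, inp: str = "") -> str:
    """Build a structured instruction prompt, with an optional input block."""
    if inp:
        return (
            "Below is an instruction that describes a task, paired with an input.\n\n"
            f"### Instruction:\n{instruction}\n\n"
            f"### Input:\n{inp}\n\n"
            "### Response:\n"
        )
    return (
        "Below is an instruction that describes a task.\n\n"
        f"### Instruction:\n{instruction}\n\n"
        "### Response:\n"
    )
```

Keeping the template in one helper makes it easy to swap in the model's real format once confirmed against the model card.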

Future Development

The developer intends to explore multi-GPU training setups to reduce training times and plans to expand the model's capabilities to include both Kurdish Kurmanji (Latin script) and Kurdish Sorani (Arabic script) in future iterations.