Overview
mwitiderrick/SwahiliInstruct-v0.2 is a 7-billion-parameter language model built on the Mistral-7B-Instruct-v0.2 architecture. It was fine-tuned for three epochs on the Swahili Alpaca dataset, making it particularly adept at processing and generating Swahili content. Its primary strength is understanding and following instructions in Swahili, with a 4096-token context window.
Key Capabilities
- Swahili Instruction Following: Excels at understanding and executing instructions provided in Swahili.
- Swahili Text Generation: Capable of generating coherent and contextually relevant text in Swahili.
- Mistral-7B Base: Benefits from the strong foundational capabilities of the Mistral-7B-Instruct-v0.2 model.
Performance Metrics
Evaluated on the Open LLM Leaderboard, SwahiliInstruct-v0.2 demonstrates balanced performance across benchmarks, with an average score of 54.25. Notable scores include 78.22 on HellaSwag (10-shot) and 73.24 on Winogrande (5-shot), indicating strong common-sense reasoning. Detailed evaluation results are available on the Hugging Face Open LLM Leaderboard.
Usage
Developers can integrate this model using the Hugging Face `transformers` library, with a straightforward prompt template for instruction-based queries. The model is designed for direct loading and inference, supporting text generation with generation parameters such as `max_length` and `repetition_penalty`.
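A minimal usage sketch is below. The Alpaca-style prompt template and the `max_length`/`repetition_penalty` values are illustrative assumptions, not the model card's confirmed format; check the Hugging Face model page for the exact template before use.

```python
# Minimal sketch of loading and prompting the model via transformers.
# ASSUMPTIONS: the Alpaca-style prompt template and the generation
# parameter values below are illustrative, not confirmed by the model card.

MODEL_ID = "mwitiderrick/SwahiliInstruct-v0.2"

def build_prompt(instruction: str) -> str:
    """Wrap an instruction in an Alpaca-style template (assumed format)."""
    return (
        "### Instruction:\n"
        f"{instruction}\n\n"
        "### Response:\n"
    )

def generate(instruction: str, max_length: int = 256,
             repetition_penalty: float = 1.2) -> str:
    """Load the model and generate a response (requires transformers + torch)."""
    from transformers import AutoModelForCausalLM, AutoTokenizer  # lazy import

    tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
    model = AutoModelForCausalLM.from_pretrained(MODEL_ID)
    inputs = tokenizer(build_prompt(instruction), return_tensors="pt")
    outputs = model.generate(
        **inputs,
        max_length=max_length,
        repetition_penalty=repetition_penalty,
    )
    return tokenizer.decode(outputs[0], skip_special_tokens=True)

# Example call (downloads the full model weights on first use):
# print(generate("Eleza faida za kusoma vitabu."))
```

The model loading is kept inside `generate` so the prompt-building helper can be used or tested without pulling the 7B weights.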