Alfaxad/gemma2-9b-swahili-it
Alfaxad/gemma2-9b-swahili-it is a 9 billion parameter decoder-only transformer model, fine-tuned by Alfaxad Eyembe from Google's Gemma2-9B-IT base model. Optimized for natural Swahili language understanding and generation, it demonstrates improved performance on Swahili MMLU and sentiment analysis tasks. This model excels at instruction following, text generation, and question answering specifically in Swahili, making it suitable for applications requiring robust Swahili language capabilities.
Loading preview...
Overview
Alfaxad/gemma2-9b-swahili-it is a 9 billion parameter decoder-only transformer model developed by Alfaxad Eyembe. It is a fine-tuned variant of Google's Gemma2-9B-IT, specifically optimized for the Swahili language. The model was fine-tuned using Low-Rank Adaptation (LoRA) on a comprehensive dataset of 67,017 Swahili instruction-response pairs, totaling over 16 million tokens.
Key Capabilities
- Enhanced Swahili Performance: Shows significant improvement over its base model, with a +7.02% increase in Swahili MMLU accuracy (from 45.61% to 52.63%) and a +1.15% increase in Swahili sentiment analysis accuracy (from 84.85% to 86.00%).
- Natural Language Generation: Capable of generating natural and coherent Swahili text.
- Instruction Following: Designed to follow general instructions effectively in Swahili.
- Efficient Fine-tuning: Utilizes LoRA for efficient parameter updates, trained over 400 steps with a learning rate of 2e-4.
Intended Use Cases
This model is particularly well-suited for applications requiring strong Swahili language processing:
- Natural Swahili text generation and creative writing.
- Question answering systems in Swahili.
- Sentiment analysis of Swahili content.
- General instruction following for Swahili-speaking users.