Model Overview
Gemma2-2B-Swahili-IT is a 2.6 billion parameter, decoder-only transformer model developed by Alfaxad Eyembe. It is a fine-tuned variant of Google's Gemma2-2B-IT, specifically optimized for the Swahili language. The model was fine-tuned using Low-Rank Adaptation (LoRA) on a comprehensive dataset of over 67,000 instruction-response pairs, totaling more than 16 million tokens of high-quality, naturally-written Swahili content.
Key Capabilities
- Swahili Language Proficiency: Designed for natural Swahili understanding and generation.
- Instruction Following: Capable of general instruction following in Swahili.
- Text Generation: Performs basic Swahili text generation and simple creative writing.
- Question Answering: Supports question answering tasks in Swahili.
- Sentiment Analysis: Achieves 86.00% accuracy on sentiment analysis, an improvement over the base model.
- Efficiency: Lightweight with 2.6 billion parameters, offering lower memory requirements and faster inference.
Performance Highlights
The fine-tuned model shows notable improvements over its base model:
- Swahili MMLU: Increased accuracy from 31.58% to 38.60% (+7.02%).
- Sentiment Analysis: Improved accuracy from 84.85% to 86.00% (+1.15%), with 100% response validity.
Ideal Use Cases
This model is particularly well-suited for:
- Applications requiring basic Swahili text generation and understanding.
- Question answering systems in Swahili.
- Sentiment analysis of Swahili text.
- Deployment in resource-constrained environments due to its small size and efficiency.