Alfaxad/gemma2-2b-swahili-it

Hugging Face
TEXT GENERATIONConcurrency Cost:1Model Size:2.6BQuant:BF16Ctx Length:8kPublished:Jan 11, 2025License:apache-2.0Architecture:Transformer0.0K Open Weights Warm

Alfaxad/gemma2-2b-swahili-it is a 2.6 billion parameter decoder-only transformer model developed by Alfaxad Eyembe, fine-tuned from Google's Gemma2-2B-IT. This lightweight model is optimized for natural Swahili language understanding and generation, offering a resource-efficient solution for tasks like text generation, question answering, and sentiment analysis in Swahili. It demonstrates improved performance on Swahili MMLU and sentiment analysis compared to its base model, making it suitable for resource-constrained environments.

Loading preview...

Model Overview

Gemma2-2B-Swahili-IT is a 2.6 billion parameter, decoder-only transformer model developed by Alfaxad Eyembe. It is a fine-tuned variant of Google's Gemma2-2B-IT, specifically optimized for the Swahili language. The model was fine-tuned using Low-Rank Adaptation (LoRA) on a comprehensive dataset of over 67,000 instruction-response pairs, totaling more than 16 million tokens of high-quality, naturally-written Swahili content.

Key Capabilities

  • Swahili Language Proficiency: Designed for natural Swahili understanding and generation.
  • Instruction Following: Capable of general instruction following in Swahili.
  • Text Generation: Performs basic Swahili text generation and simple creative writing.
  • Question Answering: Supports question answering tasks in Swahili.
  • Sentiment Analysis: Achieves 86.00% accuracy on sentiment analysis, an improvement over the base model.
  • Efficiency: Lightweight with 2.6 billion parameters, offering lower memory requirements and faster inference.

Performance Highlights

The fine-tuned model shows notable improvements over its base model:

  • Swahili MMLU: Increased accuracy from 31.58% to 38.60% (+7.02%).
  • Sentiment Analysis: Improved accuracy from 84.85% to 86.00% (+1.15%), with 100% response validity.

Ideal Use Cases

This model is particularly well-suited for:

  • Applications requiring basic Swahili text generation and understanding.
  • Question answering systems in Swahili.
  • Sentiment analysis of Swahili text.
  • Deployment in resource-constrained environments due to its small size and efficiency.