sartifyllc/Pawa-Gemma-Swahili-2B
Hugging Face
TEXT GENERATIONConcurrency Cost:1Model Size:2.6BQuant:BF16Ctx Length:8kPublished:Jan 13, 2025License:apache-2.0Architecture:Transformer0.0K Open Weights Warm

Pawa-Gemma-Swahili-2B by sartifyllc is a 2.6 billion parameter language model built on the Gemma-2 base architecture, specifically fine-tuned for Swahili and English. It features a custom tokenizer and leverages supervised fine-tuning (SFT) and Direct Preference Optimization (DPO) on Swahili datasets. This model excels in contextually rich Swahili-focused tasks, general assistance, and chat-based interactions, making it suitable for applications requiring nuanced understanding in both languages.

Loading preview...

PAWA: Swahili SML for Various Tasks

PAWA (Pawa-Gemma-Swahili-2B) is a 2.6 billion parameter language model developed by sartifyllc, built upon the Gemma-2 base architecture. It is uniquely specialized for Swahili and English, incorporating a custom tokenizer to enhance multi-language flexibility. The model's development involved extensive supervised fine-tuning (SFT) and Direct Preference Optimization (DPO) using Swahili datasets, which contributes to its improved performance and consistent responses.

Key Capabilities

  • Swahili and English Proficiency: Designed for nuanced understanding and interaction in both languages.
  • Contextual Understanding: Excels in tasks requiring deep contextual comprehension, particularly in Swahili.
  • Optimized Responses: Leverages DPO for deterministic and consistent output generation.
  • Chat Template Support: Configurable with various chat templates (e.g., ChatML) for seamless conversational experiences.

Good for

  • Contextually Rich Swahili-focused Tasks: Ideal for applications where precise Swahili language processing is critical.
  • General Assistance and Chatbots: Suitable for general-purpose chat environments and providing structured answers.
  • Retrieval-Augmented Generation (RAG): Works effectively in RAG systems and other specific use cases requiring augmented generation.