Pawa-Gemma-Swahili-2B by sartifyllc is a 2.6 billion parameter language model built on the Gemma-2 base architecture, specifically fine-tuned for Swahili and English. It features a custom tokenizer and leverages supervised fine-tuning (SFT) and Direct Preference Optimization (DPO) on Swahili datasets. This model excels in contextually rich Swahili-focused tasks, general assistance, and chat-based interactions, making it suitable for applications requiring nuanced understanding in both languages.
Loading preview...
PAWA: Swahili SML for Various Tasks
PAWA (Pawa-Gemma-Swahili-2B) is a 2.6 billion parameter language model developed by sartifyllc, built upon the Gemma-2 base architecture. It is uniquely specialized for Swahili and English, incorporating a custom tokenizer to enhance multi-language flexibility. The model's development involved extensive supervised fine-tuning (SFT) and Direct Preference Optimization (DPO) using Swahili datasets, which contributes to its improved performance and consistent responses.
Key Capabilities
- Swahili and English Proficiency: Designed for nuanced understanding and interaction in both languages.
- Contextual Understanding: Excels in tasks requiring deep contextual comprehension, particularly in Swahili.
- Optimized Responses: Leverages DPO for deterministic and consistent output generation.
- Chat Template Support: Configurable with various chat templates (e.g., ChatML) for seamless conversational experiences.
Good for
- Contextually Rich Swahili-focused Tasks: Ideal for applications where precise Swahili language processing is critical.
- General Assistance and Chatbots: Suitable for general-purpose chat environments and providing structured answers.
- Retrieval-Augmented Generation (RAG): Works effectively in RAG systems and other specific use cases requiring augmented generation.