ai-for-good-lab/byol-nya-4b-it
The ai-for-good-lab/byol-nya-4b-it model is a 4.3-billion-parameter instruction-tuned language model for Chichewa (nya), developed by ai-for-good-lab using the BYOL framework. Built on google/gemma-3-4b-pt, it specializes in instruction-following for this low-resource language and serves as an intermediate checkpoint that provides the instruction-following component for Chichewa.
Overview
ai-for-good-lab/byol-nya-4b-it is a 4.3-billion-parameter instruction-tuned language model developed specifically for Chichewa (nya), a low-resource language. It was created by ai-for-good-lab using the BYOL framework, which aims to extend large language models to languages with limited digital resources. The model is based on google/gemma-3-4b-pt and underwent supervised fine-tuning (SFT) on translated instruction-following data (SmolTalk2 + AYA).
Key Capabilities
- Instruction-following in Chichewa: Designed to understand and respond to instructions written in Chichewa (see the inference sketch after this list).
- Low-resource language support: Leverages the BYOL framework to bring LLM capabilities to languages like Chichewa.
- Intermediate checkpoint: This model provides the instruction-following component, intended to be merged with a language-specific pre-trained model for optimal performance.
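The following is a minimal inference sketch for trying the instruction-following behavior described above. It assumes the checkpoint loads through the standard Hugging Face transformers text-generation pipeline and inherits a Gemma-style chat template from its base model; the Chichewa prompt and generation settings are illustrative and not taken from the model card.

```python
# Minimal inference sketch (not from the model card): assumes the checkpoint
# works with the standard transformers text-generation pipeline and chat
# templating; adjust dtype/device settings to your hardware.
import torch
from transformers import pipeline

generator = pipeline(
    "text-generation",
    model="ai-for-good-lab/byol-nya-4b-it",
    torch_dtype=torch.bfloat16,  # half precision to reduce GPU memory use
    device_map="auto",
)

# Illustrative Chichewa instruction: "Tell me about the country of Malawi."
messages = [
    {"role": "user", "content": "Ndiuzeni za dziko la Malawi."},
]

result = generator(messages, max_new_tokens=256, do_sample=False)
# With chat-style input, the pipeline returns the full conversation;
# the last message is the model's reply.
print(result[0]["generated_text"][-1]["content"])
```

If you instead use the merged variant recommended below, the same loading pattern should apply; only the model identifier changes.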
Usage Considerations
This model is an intermediate checkpoint focused on instruction tuning. For the best overall performance, combining both language knowledge and instruction-following, use the merged variant of this model, which brings together the language-specific pre-training and this instruction-tuned version.