ai-for-good-lab/byol-nya-4b-merged

  • Modality: Vision
  • Concurrency Cost: 1
  • Model Size: 4.3B
  • Quant: BF16
  • Context Length: 32k
  • Published: Apr 15, 2026
  • License: Gemma
  • Architecture: Transformer

ai-for-good-lab/byol-nya-4b-merged is a 4.3-billion-parameter language model developed by ai-for-good-lab on top of Google's gemma-3-4b-pt base model, with a 32K context length. It is designed specifically for the Chichewa (nya) language, combining continual pre-training and instruction-following capabilities through model merging. The result is optimized for chat and instruction following in Chichewa, making it a strong option for low-resource language applications.


Overview

This model, developed by ai-for-good-lab, is a 4.3 billion parameter language model specifically adapted for the Chichewa (nya) language. Built upon the google/gemma-3-4b-pt base model, it leverages the BYOL framework to extend LLM capabilities to low-resource languages. It integrates knowledge from continual pre-training and instruction-following through a merging process, making it the recommended version for most users seeking a robust Chichewa LLM.
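Since the merged checkpoint is published under a standard Hugging Face model id, loading it for Chichewa chat might look like the sketch below. This is an assumption based on how plain Gemma 3 checkpoints behave with the `transformers` text-generation pipeline; the dtype and device settings are illustrative, not part of this model card.

```python
def chat_nya(prompt: str, model_id: str = "ai-for-good-lab/byol-nya-4b-merged") -> str:
    """Run one Chichewa chat turn through the merged model.

    Sketch only: assumes the checkpoint loads via the standard
    `transformers` text-generation pipeline, as stock Gemma 3
    checkpoints do. Downloads ~8 GB of BF16 weights on first call.
    """
    # Deferred import: pulling in transformers (and then the weights)
    # is heavy, so the sketch keeps module import cheap.
    from transformers import pipeline

    generator = pipeline(
        "text-generation",
        model=model_id,
        torch_dtype="bfloat16",  # matches the BF16 quantization listed above
        device_map="auto",
    )
    messages = [{"role": "user", "content": prompt}]
    out = generator(messages, max_new_tokens=256)
    # Chat-style pipelines return the whole conversation; the last
    # message is the model's reply.
    return out[0]["generated_text"][-1]["content"]
```

For example, `chat_nya("Muli bwanji?")` would send a single Chichewa greeting and return the model's reply as a string.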

Key Capabilities

  • Chichewa Language Proficiency: Specialized for understanding and generating text in Chichewa.
  • Instruction Following: Combines continual pre-training with supervised fine-tuning for effective instruction-following.
  • Merged Architecture: Achieves strong overall performance by merging the continual pre-training (CPT) and instruction-tuned (IT) checkpoints into the original Gemma 3 instruction model.
  • Chat Support: Designed to support chat and conversational interactions in Chichewa.
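Chat behaviour ultimately depends on the underlying chat template. As a rough illustration, assuming the merged model inherits Gemma's standard `<start_of_turn>`/`<end_of_turn>` turn markers (the tokenizer's own `apply_chat_template` is the authoritative source), a single user turn can be formatted like this:

```python
def format_gemma_turn(user_message: str) -> str:
    """Format one user turn in Gemma's turn-token chat style.

    Assumption: the merged checkpoint keeps Gemma 3's template.
    Prefer tokenizer.apply_chat_template() once the tokenizer is loaded.
    """
    return (
        "<start_of_turn>user\n"
        f"{user_message}<end_of_turn>\n"
        "<start_of_turn>model\n"
    )

# "Muli bwanji?" is a Chichewa greeting ("How are you?").
prompt = format_gemma_turn("Muli bwanji?")
```

The trailing `<start_of_turn>model\n` cues the model to generate its reply in the next turn.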

Good For

  • Applications requiring Chichewa language support: Ideal for chatbots, content generation, and other NLP tasks in Chichewa.
  • Research in low-resource language LLMs: Demonstrates an effective method for adapting large language models to languages with limited digital resources.
  • Instruction-tuned tasks: Excels in scenarios where the model needs to follow specific instructions or prompts in Chichewa.