ai-for-good-lab/byol-nya-4b-cpt

Vision · Concurrency Cost: 1 · Model Size: 4.3B · Quant: BF16 · Ctx Length: 32k · Published: Apr 15, 2026 · License: gemma · Architecture: Transformer

The ai-for-good-lab/byol-nya-4b-cpt is a 4.3 billion parameter continually pre-trained (CPT) language model developed by ai-for-good-lab, based on Google's Gemma 3. The model is specifically adapted for the Chichewa (nya) language, extending its fluency in Chichewa while retaining English capabilities. It is primarily designed for text completion in Chichewa and English and supports a 32,768-token context length.


ai-for-good-lab/byol-nya-4b-cpt: Chichewa Continual Pre-Training

This model, developed by ai-for-good-lab, is a 4.3 billion parameter continually pre-trained (CPT) language model. It is built upon the google/gemma-3-4b-pt base model and has been specifically adapted for the Chichewa (nya) language using the BYOL framework. The training involved a curated bilingual corpus of Chichewa and English text, enhancing the model's proficiency in Chichewa while preserving its English language abilities.
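Because this is a base (non-instruction-tuned) model, it is used by feeding plain text and sampling a continuation. A minimal sketch with the Hugging Face `transformers` library is shown below; the repo id and generation settings are illustrative assumptions, not an official snippet from ai-for-good-lab.

```python
# Sketch: plain text completion with a base (non-instruct) model.
# Repo id and settings are assumptions for illustration.

MODEL_ID = "ai-for-good-lab/byol-nya-4b-cpt"
CTX_LEN = 32768  # advertised context window

def truncate_left(token_ids, max_len):
    """Keep only the most recent max_len tokens so the prompt plus the
    requested new tokens fit inside the context window."""
    return token_ids[-max_len:] if max_len > 0 else []

def complete(prompt: str, max_new_tokens: int = 64) -> str:
    # Heavy imports are deferred so the helper above stays usable
    # without torch/transformers installed.
    import torch
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
    model = AutoModelForCausalLM.from_pretrained(
        MODEL_ID, torch_dtype=torch.bfloat16
    )
    ids = tokenizer(prompt).input_ids
    ids = truncate_left(ids, CTX_LEN - max_new_tokens)
    out = model.generate(
        torch.tensor([ids]), max_new_tokens=max_new_tokens, do_sample=False
    )
    # Return only the newly generated continuation, not the prompt.
    return tokenizer.decode(out[0][len(ids):], skip_special_tokens=True)
```

A Chichewa prompt such as `complete("Dzina langa ndi ")` would then return a plain continuation rather than a chat-style answer, since no chat template applies to a base model.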

Key Capabilities

  • Chichewa Language Adaptation: Significantly extends the base Gemma model's knowledge and fluency in Chichewa.
  • Bilingual Proficiency: Retains strong English capabilities alongside its new Chichewa understanding.
  • Continual Pre-Training: Benefits from additional training on a specialized dataset, improving performance for the target language.
  • Large Context Window: Supports a 32,768-token context window, allowing it to process longer texts.

Good For

  • Text Completion: As a base (non-instruction-tuned) model, it excels at generating continuations for given prompts.
  • Chichewa Language Applications: Ideal for research and development involving natural language processing in Chichewa.
  • Foundation for Further Fine-tuning: Can serve as a strong base for instruction-tuning or other downstream tasks requiring Chichewa language understanding.
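For the fine-tuning use case above, one common lightweight route is LoRA adaptation via the `peft` library. The sketch below shows how such a configuration might look; all hyperparameter values are illustrative assumptions, not a recipe published by ai-for-good-lab.

```python
# Hypothetical LoRA settings for adapting the base model to a downstream
# task (e.g. instruction-tuning in Chichewa). Values are illustrative.
lora_settings = {
    "r": 16,                # adapter rank
    "lora_alpha": 32,       # scaling factor
    "lora_dropout": 0.05,
    "target_modules": ["q_proj", "k_proj", "v_proj", "o_proj"],
}

def build_lora_config():
    # Deferred import so the settings dict is inspectable without peft.
    from peft import LoraConfig
    return LoraConfig(task_type="CAUSAL_LM", **lora_settings)
```

`peft.get_peft_model(model, build_lora_config())` would then wrap the loaded base model with trainable adapters before training, leaving the 4.3B base weights frozen.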