ai-for-good-lab/byol-nya-12b-cpt
ai-for-good-lab/byol-nya-12b-cpt is a 12-billion-parameter continually pre-trained (CPT) language model developed by ai-for-good-lab, based on Google's Gemma 3 architecture. It has been adapted for Chichewa (nya) using the BYOL framework, extending its fluency in Chichewa while retaining English capabilities. With a 32,768-token context length, this base model is primarily designed for text completion in both Chichewa and English.
Overview
This model, byol-nya-12b-cpt, is a 12-billion-parameter continually pre-trained (CPT) language model developed by ai-for-good-lab. It is built upon the google/gemma-3-12b-pt base model and has been specifically adapted for the Chichewa (nya) language. The adaptation process used the BYOL framework, which involved further training on a curated bilingual corpus of Chichewa and English text.
Key Capabilities
- Bilingual Fluency: Extends the base model's knowledge and fluency in Chichewa while preserving its English language capabilities.
- Continual Pre-Training: Represents a CPT stage, enhancing the model's understanding and generation in the target low-resource language.
- Text Completion: As a base model (not instruction-tuned), its primary strength lies in generating coherent text completions.
Good For
- Research and Development: Ideal for researchers exploring low-resource language adaptation and continual pre-training techniques.
- Chichewa Language Applications: Suitable for applications requiring text generation or understanding in Chichewa.
- Foundation for Fine-tuning: Can serve as a strong foundation for further instruction-tuning or task-specific fine-tuning for Chichewa-English bilingual tasks. For instruction-following, a merged variant is available.
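Since this is a base model intended for text completion, a minimal usage sketch may help. The snippet below is an assumption-laden illustration, not official usage instructions: it assumes the checkpoint is published on the Hugging Face Hub under the id ai-for-good-lab/byol-nya-12b-cpt, that `transformers` and `torch` are installed, and that enough accelerator memory is available for a 12B model (adjust dtype and device mapping for your hardware).

```python
# Hypothetical text-completion sketch for byol-nya-12b-cpt.
# Assumptions: checkpoint available on the Hugging Face Hub under MODEL_ID;
# `transformers` and `torch` installed; hardware large enough for 12B weights.

MODEL_ID = "ai-for-good-lab/byol-nya-12b-cpt"

def complete(prompt: str, max_new_tokens: int = 64) -> str:
    """Continue `prompt` with greedy decoding using the CPT base model."""
    import torch
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
    model = AutoModelForCausalLM.from_pretrained(
        MODEL_ID,
        torch_dtype=torch.bfloat16,  # halve memory vs. fp32; adjust as needed
        device_map="auto",           # spread layers across available devices
    )
    inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
    outputs = model.generate(**inputs, max_new_tokens=max_new_tokens, do_sample=False)
    return tokenizer.decode(outputs[0], skip_special_tokens=True)

if __name__ == "__main__":
    # As a base (non-instruction-tuned) model, it continues raw text rather
    # than following instructions, so prompt with a passage to complete.
    print(complete("Dziko la Malawi "))
```

Because the model is not instruction-tuned, prompts should be phrased as text to be continued (few-shot examples or document prefixes) rather than as questions or commands; for instruction-following behavior, the merged variant mentioned above is the better starting point.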