ai-for-good-lab/byol-nya-12b-cpt

Vision · Concurrency Cost: 1 · Model Size: 12B · Quant: FP8 · Context Length: 32k · Published: Apr 15, 2026 · License: Gemma · Architecture: Transformer

The ai-for-good-lab/byol-nya-12b-cpt is a 12 billion parameter continually pre-trained (CPT) language model developed by ai-for-good-lab, based on Google's Gemma 3 architecture. It has been adapted to Chichewa (nya) using the BYOL framework, which extends the model's fluency in Chichewa while retaining its English capabilities. With a 32,768-token context length, this base model is primarily designed for text completion tasks in both Chichewa and English.

Overview

This model, byol-nya-12b-cpt, is a 12 billion parameter continually pre-trained (CPT) language model developed by ai-for-good-lab. It is built upon the google/gemma-3-12b-pt base model and has been specifically adapted for the Chichewa (nya) language. The adaptation process utilized the BYOL framework, which involved further training on a curated bilingual corpus of Chichewa and English text.

Key Capabilities

  • Bilingual Fluency: Extends the base model's knowledge and fluency in Chichewa while preserving its English language capabilities.
  • Continual Pre-Training: Represents a CPT stage, enhancing the model's understanding and generation in the target low-resource language.
  • Text Completion: As a base model (not instruction-tuned), its primary strength lies in generating coherent text completions; a minimal usage sketch follows this list.
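
As a completion-only base model, byol-nya-12b-cpt can be loaded with the standard Hugging Face transformers API. The sketch below is minimal and illustrative: the bfloat16/device settings and sampling parameters are assumptions, not configurations published for this model.

```python
# Minimal text-completion sketch for byol-nya-12b-cpt.
# Assumptions: a transformers version with Gemma 3 support and a GPU
# with enough memory for a 12B model in bfloat16; sampling settings
# are illustrative, not recommendations from the model authors.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "ai-for-good-lab/byol-nya-12b-cpt"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,
    device_map="auto",
)

# Base (non-instruct) model: plain completion, no chat template.
prompt = "Dzina langa ndi"  # Chichewa: "My name is"
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(
    **inputs,
    max_new_tokens=64,
    do_sample=True,
    temperature=0.7,
)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```

Because the model is not instruction-tuned, prompts should read like the opening of a passage to be continued, in Chichewa, English, or a mix of both.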

Good For

  • Research and Development: Ideal for researchers exploring low-resource language adaptation and continual pre-training techniques.
  • Chichewa Language Applications: Suitable for applications requiring text generation or understanding in Chichewa.
  • Foundation for Fine-tuning: Can serve as a strong foundation for further instruction-tuning or task-specific fine-tuning for Chichewa-English bilingual tasks; a hedged fine-tuning sketch follows this list. For instruction-following, a merged variant is available.
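
For teams building on the CPT checkpoint, one common route is parameter-efficient fine-tuning. The sketch below uses LoRA via the peft library; the adapter targets, hyperparameters, and the placeholder corpus file are illustrative assumptions, not an official recipe from ai-for-good-lab.

```python
# Hedged LoRA fine-tuning sketch for byol-nya-12b-cpt.
# Assumptions: peft, datasets, and transformers installed;
# "nya_corpus.txt" is a placeholder text file; all hyperparameters
# below are illustrative, not published for this model.
from datasets import load_dataset
from peft import LoraConfig, get_peft_model
from transformers import (AutoModelForCausalLM, AutoTokenizer,
                          DataCollatorForLanguageModeling,
                          Trainer, TrainingArguments)

model_id = "ai-for-good-lab/byol-nya-12b-cpt"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, device_map="auto")

# Attach low-rank adapters to the attention projections -- a common
# choice for Gemma-style models, assumed here rather than documented.
lora = LoraConfig(
    r=16,
    lora_alpha=32,
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj"],
    task_type="CAUSAL_LM",
)
model = get_peft_model(model, lora)

# Placeholder corpus: any dataset with a "text" column works.
dataset = load_dataset("text", data_files={"train": "nya_corpus.txt"})["train"]

def tokenize(batch):
    return tokenizer(batch["text"], truncation=True, max_length=2048)

tokenized = dataset.map(tokenize, batched=True, remove_columns=["text"])

trainer = Trainer(
    model=model,
    args=TrainingArguments(
        output_dir="byol-nya-12b-lora",
        per_device_train_batch_size=1,
        gradient_accumulation_steps=16,
        num_train_epochs=1,
        bf16=True,
        logging_steps=10,
    ),
    train_dataset=tokenized,
    data_collator=DataCollatorForLanguageModeling(tokenizer, mlm=False),
)
trainer.train()
```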