ai-for-good-lab/byol-mri-12b-cpt

Vision · Concurrency Cost: 1 · Model Size: 12B · Quant: FP8 · Ctx Length: 32k · Published: Apr 15, 2026 · License: Gemma · Architecture: Transformer

The ai-for-good-lab/byol-mri-12b-cpt is a 12-billion-parameter continually pre-trained language model developed by ai-for-good-lab using Microsoft's BYOL framework. Derived from Google's Gemma 3 base model, it is specifically adapted for the Māori language (ISO 639-3 code mri) through training on a curated bilingual corpus, making it well suited to text completion in Māori while retaining its English capabilities.


Model Overview

The model is a continually pre-trained (CPT) adaptation of the Google Gemma 3 12B base, produced with Microsoft's BYOL framework. Further training on a curated bilingual corpus of Māori and English text extends the base model's fluency and knowledge in Māori while preserving its existing English capabilities.
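As a base (non-instruct) checkpoint, the model is used through plain causal-LM text completion. The minimal sketch below uses the Hugging Face transformers library and assumes the checkpoint is published under the repo id above and loads through the standard AutoModelForCausalLM interface, as Gemma 3 derivatives generally do; the Māori prompt is an illustrative placeholder.

```python
# Minimal text-completion sketch. Assumptions: the checkpoint is hosted on
# Hugging Face under this repo id and exposes the standard causal-LM interface.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "ai-for-good-lab/byol-mri-12b-cpt"  # assumed Hugging Face repo id
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,  # 12B weights; adjust dtype/device to your hardware
    device_map="auto",
)

# Plain Māori prompt for text completion (this is a base/CPT model,
# not an instruction-tuned chat model).
prompt = "Ko te reo Māori"
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=64, do_sample=True, temperature=0.7)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```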

Key Capabilities

  • Māori Language Adaptation: Specialized training for enhanced performance in Māori.
  • Bilingual Proficiency: Maintains English capabilities alongside new Māori fluency.
  • Continual Pre-Training: Utilizes the BYOL framework for efficient language extension.
  • Base Model Functionality: Designed for foundational language understanding and generation.

Ideal Use Cases

  • Text Completion: Best suited for generating coherent and contextually relevant text.
  • Māori Language Research: Valuable for studies and applications involving the Māori language.
  • Foundation for Fine-tuning: Can serve as a strong base for further instruction-tuning or task-specific adaptations in Māori or bilingual contexts.
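For the fine-tuning use case above, one common pattern is to freeze the 12B base and train low-rank (LoRA) adapters. The sketch below is a hypothetical setup using the Hugging Face transformers, peft, and datasets libraries; the corpus file, adapter rank, and training hyperparameters are illustrative assumptions, not settings published with this model.

```python
# Hypothetical LoRA fine-tuning sketch on a small Māori text corpus.
# "maori_corpus.txt" and all hyperparameters are placeholders.
from datasets import load_dataset
from peft import LoraConfig, get_peft_model
from transformers import (AutoModelForCausalLM, AutoTokenizer,
                          DataCollatorForLanguageModeling, Trainer,
                          TrainingArguments)

model_id = "ai-for-good-lab/byol-mri-12b-cpt"  # assumed Hugging Face repo id
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, device_map="auto")

# Low-rank adapters on the attention projections keep the 12B base frozen.
model = get_peft_model(model, LoraConfig(
    r=16, lora_alpha=32, target_modules=["q_proj", "v_proj"],
    task_type="CAUSAL_LM",
))

# Placeholder plain-text corpus, one document per line.
dataset = load_dataset("text", data_files={"train": "maori_corpus.txt"})["train"]

def tokenize(batch):
    return tokenizer(batch["text"], truncation=True, max_length=1024)

dataset = dataset.map(tokenize, batched=True, remove_columns=["text"])

trainer = Trainer(
    model=model,
    args=TrainingArguments(
        output_dir="byol-mri-ft",
        per_device_train_batch_size=1,
        gradient_accumulation_steps=8,
        num_train_epochs=1,
        bf16=True,
    ),
    train_dataset=dataset,
    # mlm=False gives standard causal-LM labels (shifted next-token targets).
    data_collator=DataCollatorForLanguageModeling(tokenizer, mlm=False),
)
trainer.train()
```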