ai-for-good-lab/byol-mri-12b-merged
The ai-for-good-lab/byol-mri-12b-merged model is a 12-billion-parameter language model built with Microsoft's BYOL framework on Google's Gemma 3 architecture. It is designed and optimized for the Māori language, combining continual pre-training with instruction tuning through model merging. The model excels at chat and instruction-following tasks in Māori, making it well suited to applications that require robust performance in this low-resource language.
Overview
This model, built with Microsoft's BYOL framework, is a 12-billion-parameter language model tailored to the Māori (mri) language. It is based on the google/gemma-3-12b-pt base model and represents a merged checkpoint combining the continual pre-training (CPT) and instruction-tuning (IT) stages: the merging process folds the acquired language knowledge and instruction-following capabilities back into the original Gemma 3 instruction model, yielding a single model for Māori language processing.
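As a minimal sketch of typical usage, the merged checkpoint can be loaded with the Hugging Face transformers library. This assumes the repository follows the standard Gemma 3 text-generation layout; the dtype and device settings below are illustrative and should be adapted to your hardware.

```python
# Minimal loading sketch (assumes a standard Gemma 3 layout on the Hub).
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "ai-for-good-lab/byol-mri-12b-merged"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,  # 12B weights need roughly 24 GB in bf16
    device_map="auto",           # requires accelerate; spreads layers across GPUs
)
```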
Key Capabilities
- Māori Language Specialization: Highly optimized for understanding and generating text in Māori.
- Instruction Following: Supports chat and instruction-following tasks, making it suitable for interactive applications (see the chat sketch after this list).
- Strong Performance: Performs robustly on Māori benchmarks, as detailed in the associated research paper.
- Merged Architecture: Benefits from both extensive language exposure and fine-tuned instruction adherence.
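The chat capability can be exercised through the tokenizer's chat template, continuing from the loading sketch above. This is a hedged example, not a prescribed recipe: the Māori prompt is purely illustrative, and the generation parameters are assumptions rather than recommended settings.

```python
# Chat-style generation sketch, reusing the model and tokenizer loaded above.
# The message content is an illustrative Māori prompt; replace it with your own.
messages = [
    {"role": "user", "content": "Kia ora! Ko wai tō ingoa?"},
]
input_ids = tokenizer.apply_chat_template(
    messages,
    add_generation_prompt=True,  # append the assistant-turn marker
    return_tensors="pt",
).to(model.device)

output_ids = model.generate(input_ids, max_new_tokens=128)
# Decode only the newly generated tokens, skipping the prompt.
print(tokenizer.decode(output_ids[0][input_ids.shape[-1]:], skip_special_tokens=True))
```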
Recommended Use
The byol-mri-12b-merged model is the recommended choice for most users working on Māori language applications. Its combined CPT and IT stages give strong overall performance on tasks that require both language fluency and instruction following. For detailed evaluation results, refer to the BYOL paper.