ai-for-good-lab/byol-mri-12b-merged
The ai-for-good-lab/byol-mri-12b-merged model is a 12-billion-parameter language model built with Microsoft's BYOL framework on Google's Gemma 3 architecture. It is designed and optimized for the Māori language, combining continual pre-training with instruction tuning through model merging. The model excels at chat and instruction-following tasks in Māori, making it well suited to applications that require robust performance in this low-resource language.
Overview
This model, built with Microsoft's BYOL framework, is a 12-billion-parameter language model tailored to the Māori (mri) language. It is based on the google/gemma-3-12b-pt base model and represents a merged checkpoint combining the continual pre-training (CPT) and instruction-tuning (IT) stages: the merging process folds the acquired language knowledge and instruction-following capabilities back into the original Gemma 3 instruction model, yielding a single model for Māori language processing.
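As a minimal sketch of typical usage, the merged checkpoint can be loaded with the Hugging Face transformers library. This assumes the repository follows the standard Gemma 3 text-generation layout; the dtype and device settings below are illustrative and should be adapted to your hardware.

```python
# Minimal loading sketch (assumes a standard Gemma 3 layout on the Hub).
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "ai-for-good-lab/byol-mri-12b-merged"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,  # 12B weights need roughly 24 GB in bf16
    device_map="auto",           # requires accelerate; spreads layers across GPUs
)
```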
Key Capabilities
- Māori Language Specialization: Highly optimized for understanding and generating text in Māori.
- Instruction Following: Supports chat and instruction-following tasks, making it suitable for interactive applications (see the chat sketch after this list).
- Strong Performance: Performs robustly on Māori benchmarks, as detailed in the associated research paper.
- Merged Architecture: Benefits from both extensive language exposure and fine-tuned instruction adherence.
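The chat capability can be exercised through the tokenizer's chat template, continuing from the loading sketch above. This is a hedged example, not a prescribed recipe: the Māori prompt is purely illustrative, and the generation parameters are assumptions rather than recommended settings.

```python
# Chat-style generation sketch, reusing the model and tokenizer loaded above.
# The message content is an illustrative Māori prompt; replace it with your own.
messages = [
    {"role": "user", "content": "Kia ora! Ko wai tō ingoa?"},
]
input_ids = tokenizer.apply_chat_template(
    messages,
    add_generation_prompt=True,  # append the assistant-turn marker
    return_tensors="pt",
).to(model.device)

output_ids = model.generate(input_ids, max_new_tokens=128)
# Decode only the newly generated tokens, skipping the prompt.
print(tokenizer.decode(output_ids[0][input_ids.shape[-1]:], skip_special_tokens=True))
```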
Recommended Use
The byol-mri-12b-merged model is the recommended choice for most users working on Māori language applications. Its combined CPT and IT stages give strong overall performance on tasks that require both language fluency and instruction following. For detailed evaluation results, refer to the BYOL paper.