ai-for-good-lab/byol-mri-4b-merged
ai-for-good-lab/byol-mri-4b-merged is a 4.3-billion-parameter language model from ai-for-good-lab, built on Google's Gemma-3-4b-pt. The model is designed specifically for the Māori language, combining continual pre-training with instruction-following capability through model merging. It performs strongly on chat and instruction-following tasks in Māori, making it well suited to low-resource language applications.
Overview
The ai-for-good-lab/byol-mri-4b-merged model is a 4.3-billion-parameter language model developed for the Māori (mri) language. It is built on the google/gemma-3-4b-pt base model and was created with Microsoft's BYOL framework, which aims to extend LLMs to low-resource languages. This model is a merged checkpoint: it integrates language knowledge from continual pre-training with instruction-following capabilities from supervised fine-tuning.
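The snippet below is a minimal sketch of how the model might be used for Māori instruction following via the Hugging Face transformers library. It assumes the merged checkpoint loads through the standard Auto classes and ships a chat template; for Gemma 3 checkpoints the exact model class and dtype handling may differ, and the Māori prompt ("Explain what the haka is.") is purely illustrative.

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "ai-for-good-lab/byol-mri-4b-merged"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,
    device_map="auto",
)

# An instruction-style prompt in Māori ("Explain what the haka is."),
# formatted with the checkpoint's chat template.
messages = [
    {"role": "user", "content": "Whakamāramahia mai he aha te haka."}
]
inputs = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

outputs = model.generate(inputs, max_new_tokens=256)
# Decode only the newly generated tokens, skipping the prompt.
print(tokenizer.decode(outputs[0][inputs.shape[-1]:], skip_special_tokens=True))
```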
Key Capabilities
- Māori Language Proficiency: Optimized for understanding and generating text in Māori.
- Instruction Following: Designed to respond to instructions and engage in chat-based interactions.
- Model Merging Approach: Combines two distinct training stages (continual pre-training and instruction tuning) into a single, robust model, enhancing overall performance (a weight-interpolation sketch follows this list).
- Strong Benchmark Performance: The merged checkpoint is the variant recommended for most users, based on its overall results on Māori benchmarks reported in the associated paper.
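The model card does not state the exact merging recipe, so the following is only a hedged illustration of one common approach: element-wise linear interpolation of two checkpoints' weights. The checkpoint names and the interpolation weight `alpha` are hypothetical placeholders, not the actual BYOL training artifacts.

```python
import torch
from transformers import AutoModelForCausalLM

# Hypothetical checkpoint names; the real intermediate checkpoints are not named here.
cpt_id = "continual-pretrained-checkpoint"  # language knowledge (placeholder)
sft_id = "instruction-tuned-checkpoint"     # instruction following (placeholder)
alpha = 0.5  # interpolation weight; an assumption, not the published value

cpt = AutoModelForCausalLM.from_pretrained(cpt_id, torch_dtype=torch.bfloat16)
sft = AutoModelForCausalLM.from_pretrained(sft_id, torch_dtype=torch.bfloat16)

sft_state = sft.state_dict()
merged_state = {}
for name, cpt_param in cpt.state_dict().items():
    # Element-wise linear interpolation between the two checkpoints.
    merged_state[name] = alpha * cpt_param + (1 - alpha) * sft_state[name]

cpt.load_state_dict(merged_state)
cpt.save_pretrained("byol-mri-4b-merged-local")
```

The appeal of a merge like this is that a single checkpoint retains the language knowledge acquired during continual pre-training while keeping the instruction-following behavior of the fine-tuned model, without a further training run.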
When to Use This Model
- Māori Language Applications: Ideal for any use case requiring natural language processing in Māori.
- Chatbots and Conversational AI: Suitable for developing instruction-following agents or chatbots that interact in Māori.
- Low-Resource Language Development: A prime example of extending powerful LLMs to languages with limited digital resources, offering a robust solution for such contexts.