pkupie/gemma-3-4b-bo-cpt
Model Size: 4.3B · Quant: BF16 · Context Length: 32k · Published: Apr 28, 2026 · License: gemma · Architecture: Transformer
pkupie/gemma-3-4b-bo-cpt is a 4.3 billion parameter Gemma 3 PT model continually pre-trained on the Tibetan portion of the MC^2 Corpus. Developed by pkupie, this model is specifically designed to enhance Tibetan language modeling capabilities. It serves as a research checkpoint for low-resource language adaptation, particularly for model merging and logit fusion.
Overview
This model, pkupie/gemma-3-4b-bo-cpt, is a 4.3 billion parameter Gemma 3 PT checkpoint that has undergone continual pretraining (CPT) specifically on the Tibetan language. The pretraining utilized the Tibetan subset of the MC^2 Corpus.
Key Characteristics
- Base Model: Gemma 3 PT 4B.
- Language Focus: Exclusively pre-trained on Tibetan (`bo`) data.
- Training Method: Continual pretraining (CPT) to adapt the base model to a low-resource language.
- Research Focus: Intended to support research in low-resource language adaptation, model merging, and logit fusion techniques.
Intended Use Cases
This model is released primarily for research purposes.
- Tibetan Language Modeling: Improves language understanding and generation for Tibetan.
- Base for Further Research: Can be used as a foundational checkpoint for subsequent research, especially in areas like model merging and dynamic logit fusion, as detailed in the paper "Efficient Low-Resource Language Adaptation via Multi-Source Dynamic Logit Fusion" (ACL 2026).
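As a rough illustration of the logit-fusion idea referenced above, the sketch below combines next-token logits from two models (e.g. the base Gemma 3 PT checkpoint and this Tibetan-CPT checkpoint) with fixed weights. This is a minimal static-weight sketch for intuition only, not the dynamic per-token weighting described in the ACL 2026 paper; the arrays and weights are illustrative assumptions.

```python
import numpy as np

def softmax(x: np.ndarray) -> np.ndarray:
    """Numerically stable softmax over a logit vector."""
    e = np.exp(x - x.max())
    return e / e.sum()

def fuse_logits(logits_list, weights) -> np.ndarray:
    """Weighted sum of per-model next-token logits (static weights).

    The paper's method chooses weights dynamically per token; here they
    are fixed constants purely for illustration.
    """
    w = np.asarray(weights, dtype=float)
    w = w / w.sum()  # normalize so the weights sum to 1
    return sum(wi * np.asarray(li, dtype=float) for wi, li in zip(w, logits_list))

# Toy 3-token vocabulary; values are made up for the example.
base_logits = np.array([2.0, 0.5, -1.0])  # hypothetical base-model logits
cpt_logits  = np.array([0.2, 1.8,  0.1])  # hypothetical Tibetan-CPT logits

fused = fuse_logits([base_logits, cpt_logits], weights=[0.3, 0.7])
probs = softmax(fused)  # sample or argmax from these fused probabilities
```

In practice both models must share a tokenizer (as a CPT checkpoint and its base model do), since logits are combined position-wise over the same vocabulary.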