pkupie/gemma-3-4b-bo-cpt

  • Capabilities: Vision
  • Concurrency Cost: 1
  • Model Size: 4.3B
  • Quant: BF16
  • Ctx Length: 32k
  • Published: Apr 28, 2026
  • License: gemma
  • Architecture: Transformer

pkupie/gemma-3-4b-bo-cpt is a 4.3 billion parameter Gemma 3 PT model continually pre-trained on the Tibetan portion of the MC^2 Corpus. Developed by pkupie, the model is designed to strengthen language modeling for Tibetan, a low-resource language. It serves as a research checkpoint for low-resource language adaptation, particularly for model merging and logit fusion.


Overview

This model, pkupie/gemma-3-4b-bo-cpt, is a 4.3 billion parameter Gemma 3 PT checkpoint that has undergone continual pretraining (CPT) specifically on the Tibetan language. The pretraining utilized the Tibetan subset of the MC^2 Corpus.

Key Characteristics

  • Base Model: Gemma 3 PT 4B.
  • Language Focus: Exclusively pre-trained on Tibetan (bo) data.
  • Training Method: Continual pretraining (CPT) to adapt the base model to a low-resource language.
  • Research Focus: Intended to support research in low-resource language adaptation, model merging, and logit fusion techniques.

Intended Use Cases

This model is primarily released for research purposes.

  • Tibetan Language Modeling: Improved understanding and generation of Tibetan text relative to the base checkpoint.
  • Base for Further Research: Can be used as a foundational checkpoint for subsequent research, especially in areas like model merging and dynamic logit fusion, as detailed in the paper "Efficient Low-Resource Language Adaptation via Multi-Source Dynamic Logit Fusion" (ACL 2026).
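To illustrate the logit-fusion direction mentioned above, here is a toy sketch that blends next-token logits from a language-adapted expert and its base model via a fixed weighted interpolation. The `fuse_logits` helper, the weights, and the tiny vocabulary are illustrative assumptions only; the cited paper selects fusion weights dynamically rather than using a fixed mix.

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax over the vocabulary axis.
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def fuse_logits(logit_rows, weights):
    """Weighted interpolation of next-token logits from several models.

    logit_rows: list of 1-D arrays, one per source model (shared vocab).
    weights: per-model fusion weights; normalized here to sum to 1.
    Returns the fused next-token probability distribution.
    """
    w = np.asarray(weights, dtype=float)
    w = w / w.sum()
    stacked = np.stack(logit_rows)           # (n_models, vocab)
    fused = (w[:, None] * stacked).sum(0)    # (vocab,)
    return softmax(fused)

# Toy 4-token vocabulary: the adapted expert and the base model
# disagree; fusion blends their preferences by weight.
expert = np.array([2.0, 0.5, 0.1, 0.1])
base = np.array([0.1, 0.1, 0.5, 2.0])
probs = fuse_logits([expert, base], weights=[0.7, 0.3])
print(probs.argmax())  # expert dominates at weight 0.7 → token 0
```

In practice the same idea operates on the full vocabulary-sized logit rows emitted at each decoding step, with weights chosen per step rather than fixed.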