pkupie/gemma-3-4b-mn-cpt

Vision · Concurrency Cost: 1 · Model Size: 4.3B · Quant: BF16 · Ctx Length: 32k · Published: Apr 29, 2026 · License: gemma · Architecture: Transformer

pkupie/gemma-3-4b-mn-cpt is a 4.3 billion parameter continual pretraining (CPT) checkpoint of Gemma 3 PT 4B, further pretrained on the Mongolian (Traditional Mongolian Script) portion of the MC^2 Corpus. It is designed to strengthen language modeling for Mongolian in the traditional script and to support research into low-resource language adaptation. Its primary use case is as a base model for further research, particularly in model merging and logit fusion techniques.


Overview

pkupie/gemma-3-4b-mn-cpt is a 4.3 billion parameter continual pretraining (CPT) checkpoint derived from Gemma 3 PT 4B. This model has been further pretrained on the Mongolian (Traditional Mongolian Script) subset of the MC^2 Corpus, with a context length of 32768 tokens. It is specifically developed to improve language modeling capabilities for Mongolian (Traditional Mongolian Script) and to facilitate research in adapting models for low-resource languages.
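Assuming the checkpoint exposes the standard Gemma 3 causal-LM interface in Hugging Face transformers, a minimal loading sketch looks like the following. The model ID comes from this card and the BF16 dtype matches the quantization listed above; the sample prompt and generation settings are illustrative assumptions.

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "pkupie/gemma-3-4b-mn-cpt"  # model ID from this card
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,  # matches the BF16 quantization listed above
    device_map="auto",           # assumes accelerate is installed
)

prompt = "ᠮᠣᠩᠭᠣᠯ"  # illustrative prompt in the Traditional Mongolian Script
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=64)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```

As a base (non-instruction-tuned) checkpoint, it is best prompted for plain continuation rather than chat-style turns.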

Key Capabilities

  • Enhanced Mongolian Language Modeling: Specialized for the Traditional Mongolian Script, aiming to improve performance in this low-resource language.
  • Research Base Model: Intended as a foundational checkpoint for advanced research, particularly in areas like model merging and logit fusion (see the merging sketch after this list).
  • Low-Resource Language Adaptation: Supports studies and applications focused on adapting large language models to languages with limited available data.
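As a concrete illustration of the model-merging use case, here is a hedged sketch of linear weight interpolation between the base Gemma 3 PT 4B checkpoint and this CPT checkpoint. This is a common merging baseline, not a method documented for this model; the base model ID, interpolation weight, and output path are assumptions.

```python
import torch
from transformers import AutoModelForCausalLM

# Both checkpoints share the same architecture, so their state dicts align key-for-key.
base = AutoModelForCausalLM.from_pretrained("google/gemma-3-4b-pt", torch_dtype=torch.bfloat16)  # assumed base ID
cpt = AutoModelForCausalLM.from_pretrained("pkupie/gemma-3-4b-mn-cpt", torch_dtype=torch.bfloat16)

lam = 0.5  # interpolation weight (assumption; tune per task)
cpt_sd = cpt.state_dict()
merged = {name: (1 - lam) * p + lam * cpt_sd[name] for name, p in base.state_dict().items()}

base.load_state_dict(merged)
base.save_pretrained("gemma-3-4b-mn-merged")  # hypothetical output path
```

Interpolating toward the CPT weights trades some of the base model's general ability for Mongolian-specific capability; the weight is typically tuned on a small validation set.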

Good for

  • Researchers working on Mongolian (Traditional Mongolian Script) language processing.
  • Projects exploring continual pretraining (CPT) methodologies.
  • Experiments in model merging and logit fusion techniques (a fusion sketch follows this list).
  • Academic research on efficient low-resource language adaptation, as detailed in the associated paper "Efficient Low-Resource Language Adaptation via Multi-Source Dynamic Logit Fusion" (ACL 2026).
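For the logit-fusion use case above, the following is a minimal fixed-weight sketch that averages next-token logits from the base model and this CPT checkpoint at each greedy decoding step. The associated paper describes dynamic multi-source fusion; this fixed-alpha version only illustrates the basic mechanism. The base model ID and the weight alpha are assumptions, and both models are assumed to share the Gemma 3 tokenizer.

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

base_id = "google/gemma-3-4b-pt"  # assumed base checkpoint ID
cpt_id = "pkupie/gemma-3-4b-mn-cpt"

tokenizer = AutoTokenizer.from_pretrained(cpt_id)
base = AutoModelForCausalLM.from_pretrained(base_id, torch_dtype=torch.bfloat16).eval()
cpt = AutoModelForCausalLM.from_pretrained(cpt_id, torch_dtype=torch.bfloat16).eval()

@torch.no_grad()
def fused_greedy_decode(prompt: str, steps: int = 32, alpha: float = 0.5) -> str:
    """Greedy decoding over a fixed convex combination of the two logit streams."""
    ids = tokenizer(prompt, return_tensors="pt").input_ids
    for _ in range(steps):
        logits_base = base(ids).logits[:, -1, :]  # next-token logits, base model
        logits_cpt = cpt(ids).logits[:, -1, :]    # next-token logits, CPT model
        fused = alpha * logits_base + (1.0 - alpha) * logits_cpt
        next_id = fused.argmax(dim=-1, keepdim=True)
        ids = torch.cat([ids, next_id], dim=-1)
    return tokenizer.decode(ids[0], skip_special_tokens=True)
```

A dynamic variant, as in the paper, would adjust the fusion weight per step or per source model rather than fixing it up front.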