pkupie/Qwen2.5-3B-bo-cpt

Text generation · Model size: 3.1B · Quant: BF16 · Context length: 32k · Published: Apr 28, 2026 · License: apache-2.0 · Architecture: Transformer · Open weights

The pkupie/Qwen2.5-3B-bo-cpt model is a 3.1 billion parameter language model continually pretrained from Qwen2.5-3B on the Tibetan portion of the MC^2 Corpus. The checkpoint is optimized for Tibetan language modeling and is intended primarily for research on low-resource language adaptation, in particular as a base for model merging and logit fusion experiments.


Qwen2.5-3B Continually Pretrained on Tibetan

This model, pkupie/Qwen2.5-3B-bo-cpt, is a 3.1 billion parameter language model produced by continual pretraining (CPT). Starting from the Qwen2.5-3B base model, it was further trained on the Tibetan subset of the MC^2 Corpus.
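The checkpoint can be loaded like any other causal language model. The snippet below is a minimal sketch, assuming the standard Hugging Face transformers AutoModelForCausalLM / AutoTokenizer APIs and the BF16 weights listed above; the Tibetan prompt is only a placeholder.

```python
# Minimal usage sketch. Assumes standard transformers loading; the prompt and
# generation settings are placeholders, not recommendations from the model card.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "pkupie/Qwen2.5-3B-bo-cpt"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,  # card lists BF16 weights
    device_map="auto",
)

# Base (non-chat) model: plain next-token continuation of a Tibetan prompt.
prompt = "བོད་"  # placeholder Tibetan prompt ("Tibet")
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=64, do_sample=True, top_p=0.9)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```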

Key Capabilities

  • Enhanced Tibetan Language Modeling: Optimized to improve performance and understanding of the Tibetan language.
  • Low-Resource Language Adaptation: Serves as a specialized checkpoint for research into adapting large language models to languages with limited data.

Good for

  • Research Purposes: Primarily intended for academic and research use, particularly in the field of low-resource NLP.
  • Base Model for Further Work: Suitable as a foundation for experiments involving model merging and logit fusion, as described in the associated research paper "Efficient Low-Resource Language Adaptation via Multi-Source Dynamic Logit Fusion" (a simplified logit-fusion sketch follows this list).
  • Tibetan Language Applications: Potentially useful for developing applications requiring strong Tibetan language capabilities.
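As a rough illustration of the kind of logit fusion this checkpoint is meant to support, the sketch below interpolates next-token logits from a Qwen2.5-3B base model and this CPT checkpoint with a fixed weight. This is not the multi-source dynamic fusion method from the paper: the base model ID, the 0.5 mixing coefficient, and greedy decoding are placeholder assumptions, and it presumes both checkpoints share the same tokenizer and vocabulary.

```python
# Illustrative fixed-weight logit interpolation between the base model and this
# CPT checkpoint. NOT the paper's dynamic fusion method; all settings below are
# placeholder assumptions for demonstration only.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

base_id = "Qwen/Qwen2.5-3B"              # assumed base checkpoint
cpt_id = "pkupie/Qwen2.5-3B-bo-cpt"

# Assumes the CPT model kept the Qwen2.5-3B tokenizer/vocabulary unchanged.
tokenizer = AutoTokenizer.from_pretrained(cpt_id)
base = AutoModelForCausalLM.from_pretrained(base_id, torch_dtype=torch.bfloat16)
cpt = AutoModelForCausalLM.from_pretrained(cpt_id, torch_dtype=torch.bfloat16)
base.eval()
cpt.eval()

alpha = 0.5  # static mixing weight (placeholder; the paper computes this dynamically)
prompt = "བོད་"  # placeholder Tibetan prompt
input_ids = tokenizer(prompt, return_tensors="pt").input_ids

with torch.no_grad():
    for _ in range(32):
        logits_base = base(input_ids).logits[:, -1, :]
        logits_cpt = cpt(input_ids).logits[:, -1, :]
        fused = alpha * logits_cpt + (1 - alpha) * logits_base  # combine next-token logits
        next_id = fused.argmax(dim=-1, keepdim=True)            # greedy decoding
        input_ids = torch.cat([input_ids, next_id], dim=-1)

print(tokenizer.decode(input_ids[0], skip_special_tokens=True))
```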