pkupie/Qwen2.5-1.5B-mn-cpt

Text Generation · Model size: 1.5B · Quantization: BF16 · Context length: 32k · Concurrency cost: 1 · Published: Apr 28, 2026 · License: apache-2.0 · Architecture: Transformer · Open weights

pkupie/Qwen2.5-1.5B-mn-cpt is a 1.5-billion-parameter language model continually pretrained from Qwen2.5-1.5B on the Mongolian (Traditional Mongolian Script) portion of the MC^2 Corpus. Developed by pkupie, the model is designed to improve language modeling for low-resource Mongolian in the traditional script and to support research on language adaptation. It has a 32,768-token context length and is intended primarily for research use, particularly in model merging and logit fusion.


Overview

This model, pkupie/Qwen2.5-1.5B-mn-cpt, is a continually pretrained (CPT) checkpoint based on the Qwen2.5-1.5B architecture. It has been further trained on the Mongolian (Traditional Mongolian Script) subset of the MC^2 Corpus.

Key Capabilities

  • Low-Resource Language Adaptation: Specifically enhanced for the Mongolian language (Traditional Mongolian Script), aiming to improve its performance in this low-resource setting.
  • Research Focus: Primarily released for research, offering a base model for advanced techniques like model merging and logit fusion.
  • Training Methodology: Utilizes continual pretraining (CPT) from an existing Qwen2.5-1.5B model, as detailed in the paper "Efficient Low-Resource Language Adaptation via Multi-Source Dynamic Logit Fusion" (ACL 2026).
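To make the logit-fusion idea concrete, here is a minimal sketch of combining next-token logits from a target (adapted) model and a source model at decode time. The function name, the toy five-token vocabulary, and the fixed interpolation weight `alpha` are illustrative assumptions, not the dynamic fusion scheme from the cited paper, which chooses weights adaptively per step.

```python
import numpy as np

def softmax(x):
    # Numerically stable softmax over the last axis.
    x = x - x.max(axis=-1, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=-1, keepdims=True)

def fuse_logits(target_logits, source_logits, alpha):
    """Blend a target model's next-token logits with a source model's.

    alpha = 1.0 keeps only the target distribution; alpha = 0.0 keeps
    only the source. A dynamic scheme would pick alpha per decode step.
    """
    return alpha * target_logits + (1.0 - alpha) * source_logits

# Toy 5-token vocabulary; these logits are made up for illustration.
target = np.array([2.0, 0.5, 0.1, -1.0, 0.0])
source = np.array([0.0, 3.0, 0.2, -0.5, 0.1])

fused = fuse_logits(target, source, alpha=0.5)
probs = softmax(fused)
next_token = int(np.argmax(probs))  # token favored by the blended distribution
```

In practice the two logit vectors would come from running both models on the same prefix; the blend then drives sampling or greedy decoding as usual.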

Intended Use Cases

  • Mongolian Language Modeling Research: Ideal for researchers working on improving language understanding and generation for Mongolian (Traditional Mongolian Script).
  • Model Merging & Logit Fusion: Serves as a suitable base model for experiments involving the combination of different models or logit fusion techniques.
  • Low-Resource NLP Studies: Contributes to the broader field of low-resource natural language processing research.
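As a sketch of the simplest model-merging setup this checkpoint could serve as a base for, the snippet below linearly interpolates two checkpoints with matching parameter names (plain weight averaging). The parameter names and shapes are toy stand-ins, not the real Qwen2.5 state dict, and linear interpolation is only one of many merging strategies.

```python
import numpy as np

def merge_state_dicts(sd_a, sd_b, weight_a=0.5):
    """Linearly interpolate two checkpoints with identical keys and shapes.

    Returns a new state dict: weight_a * sd_a + (1 - weight_a) * sd_b.
    """
    assert sd_a.keys() == sd_b.keys(), "checkpoints must share parameter names"
    return {k: weight_a * sd_a[k] + (1.0 - weight_a) * sd_b[k] for k in sd_a}

# Toy "checkpoints" standing in for the base and the CPT model.
base = {"embed.weight": np.ones((4, 2)), "lm_head.weight": np.zeros((2, 4))}
cpt  = {"embed.weight": np.full((4, 2), 3.0), "lm_head.weight": np.ones((2, 4))}

merged = merge_state_dicts(base, cpt, weight_a=0.25)
```

With real models, the same loop would run over the two checkpoints' state dicts, and the merged weights would then be loaded back into the architecture for evaluation.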