pkupie/Qwen2.5-1.5B-bo-cpt
pkupie/Qwen2.5-1.5B-bo-cpt is a 1.5 billion parameter language model continually pretrained on the Tibetan portion of the MC^2 Corpus, building on the Qwen2.5-1.5B base model. The model is specialized for Tibetan language modeling, making it a focused resource for low-resource language adaptation research. Its primary use case is to serve as a base model for further research in areas such as model merging and logit fusion, particularly for Tibetan language applications.
Overview
pkupie/Qwen2.5-1.5B-bo-cpt is a specialized 1.5 billion parameter language model developed by pkupie. It is a continual pretraining (CPT) checkpoint derived from the Qwen2.5 1.5B base model, with further pretraining exclusively on the Tibetan subset of the MC^2 Corpus. This targeted pretraining aims to significantly enhance its performance and understanding of the Tibetan language.
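Below is a minimal usage sketch with Hugging Face transformers. Since this is a base (non-chat) checkpoint, plain text continuation is the expected interaction mode; the Tibetan prompt and generation settings are illustrative, not tuned recommendations.

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "pkupie/Qwen2.5-1.5B-bo-cpt"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id)

# Base-model checkpoint: prompt with plain Tibetan text and let it continue.
prompt = "བོད་ཀྱི་"
inputs = tokenizer(prompt, return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=50, do_sample=True, top_p=0.9)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```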
Key Capabilities
- Tibetan Language Modeling: Continually pretrained on Tibetan text to improve language understanding and generation in Tibetan.
- Low-Resource Language Adaptation: Designed to support research and development in adapting large language models to languages with limited data.
- Research Base Model: Intended as a foundational checkpoint for advanced research, particularly in techniques like model merging and logit fusion, as detailed in the associated paper "Efficient Low-Resource Language Adaptation via Multi-Source Dynamic Logit Fusion" (ACL 2026); a minimal illustration of logit fusion follows this list.
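The sketch below shows the general idea behind logit fusion: combining next-token logits from two causal LMs that share a tokenizer at decode time. It is not the paper's method; the paper's scheme is dynamic and multi-source, whereas this uses a single hypothetical static weight `w`, a greedy decoding loop, and Qwen/Qwen2.5-1.5B as an assumed second model.

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

# Both models must share a tokenizer/vocabulary for logits to be comparable.
tokenizer = AutoTokenizer.from_pretrained("pkupie/Qwen2.5-1.5B-bo-cpt")
target = AutoModelForCausalLM.from_pretrained("pkupie/Qwen2.5-1.5B-bo-cpt")
source = AutoModelForCausalLM.from_pretrained("Qwen/Qwen2.5-1.5B")  # assumed second model

w = 0.7  # hypothetical static weight on the Tibetan-adapted model
ids = tokenizer("བོད་ཡིག", return_tensors="pt").input_ids

with torch.no_grad():
    for _ in range(20):
        # Next-token logits from each model for the current prefix.
        logits_t = target(ids).logits[:, -1, :]
        logits_s = source(ids).logits[:, -1, :]
        fused = w * logits_t + (1.0 - w) * logits_s  # weighted logit fusion
        next_id = fused.argmax(dim=-1, keepdim=True)  # greedy decoding
        ids = torch.cat([ids, next_id], dim=-1)

print(tokenizer.decode(ids[0], skip_special_tokens=True))
```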
Good for
- Researchers focusing on Tibetan natural language processing (NLP) tasks.
- Projects exploring low-resource language adaptation strategies.
- Experiments involving model merging, logit fusion, or other advanced model combination techniques for specialized language domains.
- Academic studies requiring a pre-trained model with strong Tibetan language capabilities.