pkupie/Qwen2.5-1.5B-kk-cpt
Text generation · Model size: 1.5B · Quant: BF16 · Context length: 32k · Concurrency cost: 1 · Published: Apr 28, 2026 · License: apache-2.0 · Architecture: Transformer · Open weights
The pkupie/Qwen2.5-1.5B-kk-cpt model is a 1.5-billion-parameter Qwen2.5-based causal language model, continually pretrained on the Kazakh (Arabic script) portion of the MC^2 Corpus. It specializes in Kazakh language modeling, offers improved performance for low-resource language adaptation, and supports a 32,768-token context length. It is intended primarily for research in areas such as model merging and logit fusion.
Overview
This model, pkupie/Qwen2.5-1.5B-kk-cpt, is a specialized checkpoint derived from the Qwen2.5 1.5B base model. It has undergone continual pretraining (CPT) on the Kazakh language, using the Arabic-script subset of the MC^2 Corpus.
Key Capabilities
- Enhanced Kazakh Language Modeling: Significantly improves performance for Kazakh language tasks, particularly for the Arabic script variant.
- Low-Resource Language Adaptation: Designed to support research and development in adapting large language models to languages with limited data resources.
- Research Base Model: Serves as a foundational model for advanced research, especially in methodologies like model merging and dynamic logit fusion, as detailed in the paper "Efficient Low-Resource Language Adaptation via Multi-Source Dynamic Logit Fusion" (ACL 2026).
Intended Use Cases
- Academic Research: Ideal for researchers exploring techniques for low-resource language processing and model adaptation.
- Base for Fine-tuning: Can be used as a starting point for further fine-tuning on specific Kazakh language tasks.
- Model Merging Experiments: Particularly relevant for experiments involving the combination of models or logit fusion techniques.
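As a base checkpoint for fine-tuning and merging experiments, the model can be loaded through the standard Hugging Face `transformers` causal-LM API. The sketch below is not taken from the model card; it assumes the usual `AutoModelForCausalLM`/`AutoTokenizer` workflow for Qwen2.5-family models, with the model ID from this page:

```python
# Minimal usage sketch for pkupie/Qwen2.5-1.5B-kk-cpt (assumed transformers workflow).
MODEL_ID = "pkupie/Qwen2.5-1.5B-kk-cpt"

def generate(prompt: str, max_new_tokens: int = 64) -> str:
    """Generate a continuation for a Kazakh (Arabic-script) prompt."""
    # Imports are kept inside the function so the sketch can be read
    # without torch/transformers installed.
    import torch
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
    model = AutoModelForCausalLM.from_pretrained(
        MODEL_ID,
        torch_dtype=torch.bfloat16,  # the card lists BF16 weights
    )
    inputs = tokenizer(prompt, return_tensors="pt")
    with torch.no_grad():
        output_ids = model.generate(**inputs, max_new_tokens=max_new_tokens)
    # Drop the prompt tokens and decode only the newly generated text.
    new_tokens = output_ids[0, inputs["input_ids"].shape[1]:]
    return tokenizer.decode(new_tokens, skip_special_tokens=True)
```

For fine-tuning or logit-fusion experiments, the same `from_pretrained` call yields the base model object to train or combine further.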