canbingol/gemma3_1B_base-tr-cpt-1epoch_stage2

Text generation · Model size: 1B · Quantization: BF16 · Context length: 32k · Published: Mar 3, 2026 · Architecture: Transformer

canbingol/gemma3_1B_base-tr-cpt-1epoch_stage2 is a 1-billion-parameter Gemma-3 variant by Can Bingol, produced by continued pretraining (CPT) on Turkish web data. This Stage 2 model builds on an earlier Turkish CPT stage and was trained for one epoch on a distinct 50,000-sample subset of a Turkish web corpus. The goal is domain adaptation to Turkish, making the model suitable for applications that require strong Turkish language understanding and generation.


Overview

This model, gemma3_1B_base-tr-cpt-1epoch_stage2, is a Stage 2 Turkish Continued Pretraining (CPT) variant of the Gemma-3-1B model, developed by Can Bingol. It is designed to strengthen the base model's understanding and generation capabilities in Turkish.

Key Characteristics

  • Continued Pretraining: This model was initialized from canbingol/gemma3_1B_base-tr-cpt-1epoch_stage1, representing a sequential CPT approach.
  • Turkish Domain Adaptation: It was trained for one epoch on a new, disjoint subset (samples 50,000–100,000) of a Turkish web corpus, building on the 0–50,000 samples from Stage 1.
  • Cumulative Training: In total, the model has been exposed to approximately 43 million tokens from the first 100,000 samples of the Turkish web corpus.
  • Base Model: It is based on the Gemma-3-1B architecture, providing a compact yet capable foundation.
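Getting Started

Below is a minimal loading and generation sketch using the standard transformers API. It assumes the checkpoint follows the usual Hugging Face format with BF16 weights, as the metadata above suggests; the Turkish prompt is illustrative. Note that this is a base (not instruction-tuned) model, so it continues text rather than following instructions.

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "canbingol/gemma3_1B_base-tr-cpt-1epoch_stage2"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,  # the card lists BF16 weights
    device_map="auto",
)

# Illustrative Turkish prompt; a base model continues the text.
prompt = "Türkiye'nin en kalabalık şehri"
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=50, do_sample=True, top_p=0.9)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```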

Use Cases

This model is particularly well-suited for:

  • Turkish Language Applications: Ideal for tasks requiring strong performance in Turkish, such as text generation, summarization, or translation within a Turkish context.
  • Further Fine-tuning: Serves as a robust base for subsequent fine-tuning on specific Turkish downstream tasks (see the sketch after this list).
  • Research in CPT: Useful for researchers exploring sequential continued pretraining strategies and domain adaptation for low-resource languages.
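
To illustrate the fine-tuning use case, here is a hedged sketch using PEFT/LoRA, which keeps the 1B base frozen and trains only small adapter matrices. The dataset file (turkish_task.jsonl), hyperparameters, and LoRA target modules are illustrative assumptions, not settings from the model card.

```python
from datasets import load_dataset
from peft import LoraConfig, get_peft_model
from transformers import (
    AutoModelForCausalLM,
    AutoTokenizer,
    DataCollatorForLanguageModeling,
    Trainer,
    TrainingArguments,
)

model_id = "canbingol/gemma3_1B_base-tr-cpt-1epoch_stage2"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id)

# LoRA config: rank and target modules are assumptions for illustration.
lora = LoraConfig(
    r=16, lora_alpha=32, target_modules=["q_proj", "v_proj"],
    task_type="CAUSAL_LM",
)
model = get_peft_model(model, lora)

# Hypothetical local dataset with a "text" column of Turkish examples.
dataset = load_dataset("json", data_files="turkish_task.jsonl", split="train")

def tokenize(batch):
    return tokenizer(batch["text"], truncation=True, max_length=1024)

tokenized = dataset.map(tokenize, batched=True,
                        remove_columns=dataset.column_names)

trainer = Trainer(
    model=model,
    args=TrainingArguments(
        output_dir="gemma3-tr-ft",
        per_device_train_batch_size=2,
        num_train_epochs=1,
        learning_rate=2e-4,
        bf16=True,
    ),
    train_dataset=tokenized,
    # mlm=False gives standard causal language modeling labels.
    data_collator=DataCollatorForLanguageModeling(tokenizer, mlm=False),
)
trainer.train()
```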