canbingol/gemma3_1B_base-tr-cpt-3epoch_15k_data

Text generation · Model size: 1B · Quantization: BF16 · Context length: 32k · Published: Mar 2, 2026 · Architecture: Transformer

The canbingol/gemma3_1B_base-tr-cpt-3epoch_15k_data model is a 1-billion-parameter Gemma-3-1B variant by canbingol, optimized for Turkish language tasks. It underwent continued pretraining for 3 epochs on a 15,000-sample subset of a Turkish web corpus, improving its Turkish language modeling capabilities and domain familiarity. The model is intended primarily for research and experimental use in Turkish natural language processing.


Overview

This model, developed by canbingol, is a Turkish Continued Pretraining (CPT) variant of the google/gemma-3-1b-pt base model. It has been further trained for 3 epochs on the initial 15,000 samples from a Turkish web corpus, specifically canbingol/vngrs-web-corpus-200k.
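Since this is a base (non-instruct) checkpoint, the natural way to try it is plain next-token continuation. The sketch below loads it with the Hugging Face transformers library, assuming the checkpoint follows the standard Gemma 3 layout; the Turkish prompt and the `generate_turkish` helper are illustrative, not from the model card.

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

MODEL_ID = "canbingol/gemma3_1B_base-tr-cpt-3epoch_15k_data"


def generate_turkish(prompt: str, max_new_tokens: int = 50) -> str:
    """Continue a Turkish prompt with the CPT checkpoint (greedy decoding)."""
    tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
    # The card lists BF16 weights, so request bfloat16 at load time.
    model = AutoModelForCausalLM.from_pretrained(MODEL_ID, torch_dtype="bfloat16")
    inputs = tokenizer(prompt, return_tensors="pt")
    # Base model: no chat template, just raw continuation of the prompt.
    outputs = model.generate(**inputs, max_new_tokens=max_new_tokens, do_sample=False)
    return tokenizer.decode(outputs[0], skip_special_tokens=True)


if __name__ == "__main__":
    print(generate_turkish("Türkiye'nin en kalabalık şehri"))
```

For a 1B model this fits comfortably on a single consumer GPU or even CPU; greedy decoding is used here only to keep the example deterministic.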

Key Capabilities

  • Enhanced Turkish Language Modeling: Improved proficiency in generating and understanding Turkish text due to targeted pretraining.
  • Domain Familiarity: Increased familiarity with Turkish web content, making it suitable for tasks related to Turkish digital media.
  • Research and Experimental Use: Primarily designed for academic and experimental exploration of Turkish NLP applications.

Good for

  • Turkish Text Generation: Creating coherent and contextually relevant text in Turkish.
  • Turkish Language Understanding: Tasks requiring a nuanced grasp of the Turkish language.
  • Exploratory NLP Research: Investigating the impact of continued pretraining on smaller models for specific languages.
  • Developing Turkish-centric Applications: Serving as a foundation model for applications that require strong Turkish language capabilities.