zero9tech/Qwen3-8B-Wikipedia-TR-CPT

Text Generation | Concurrency Cost: 1 | Model Size: 8B | Quant: FP8 | Ctx Length: 32k | Published: Apr 13, 2026 | License: apache-2.0 | Architecture: Transformer | Open Weights | Warm

The zero9tech/Qwen3-8B-Wikipedia-TR-CPT model is a Qwen3-8B variant developed by Zero9 Tech and adapted for reasoning and technical expression in Turkish. It underwent continued pretraining (CPT) with a QLoRA-based approach, primarily on Turkish Wikipedia data (approximately 99% of the CPT dataset). The adaptation aims to give more coherent reasoning, better-structured explanations for information-intensive queries, and smoother flow in technical and analytical responses in Turkish.


Model Overview

zero9tech/Qwen3-8B-Wikipedia-TR-CPT is a specialized model developed by Zero9 Tech to strengthen Turkish language capabilities. Its primary goal is to improve the quality of reasoning and technical expression in Turkish contexts.

Key Adaptations and Training

This model uses a QLoRA-based continued pretraining (CPT) approach rather than full-parameter re-pretraining. The base model was loaded in 4-bit precision, and updates were applied through LoRA adapters. The CPT data consisted predominantly of Turkish content from the wikimedia/wikipedia dataset, which made up approximately 99% of the adaptation data mix.
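To illustrate why adapter-based training differs from full-parameter training, the sketch below counts the trainable parameters that a LoRA adapter adds to a single projection matrix. The rank r = 128 is the value reported in this card; the layer width of 4096 is an assumption chosen for illustration, and the helper names are hypothetical.

```python
def full_param_count(d_in: int, d_out: int) -> int:
    """Parameters in a dense weight matrix of shape (d_out, d_in)."""
    return d_in * d_out


def lora_param_count(d_in: int, d_out: int, r: int) -> int:
    """Trainable parameters of one LoRA adapter.

    The adapter factors the update as B @ A, where A has shape (r, d_in)
    and B has shape (d_out, r). The dense weight itself stays frozen.
    """
    return r * d_in + d_out * r


# Assumed square projection width (illustrative, not taken from the card).
D = 4096
# LoRA rank reported in this card.
R = 128

full = full_param_count(D, D)
lora = lora_param_count(D, D, R)

print(f"dense weight parameters : {full:,}")
print(f"LoRA adapter parameters : {lora:,}")
print(f"trainable fraction      : {lora / full:.4f}")
```

Under these assumptions only a small fraction of each adapted matrix is trained. The real ratio depends on the actual shapes of the adapted modules.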

LoRA adapters were applied to the attention projections (q_proj, k_proj, v_proj, o_proj), the MLP projections (gate_proj, up_proj, down_proj), and the embedding and output layers (embed_tokens, lm_head). Key LoRA settings included r = 128, lora_alpha = 128, and use_rslora = True. Each training example was formed by combining the title and body of a Turkish Wikipedia article, with an EOS_TOKEN appended at the end.
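The settings and formatting step described above can be sketched in plain Python. The module names and LoRA values are taken from this card. The scaling formula is the usual rank-stabilized LoRA convention (lora_alpha / sqrt(r) instead of lora_alpha / r). The text template (title, blank line, body), the EOS string, and the function names are assumptions for illustration, since the card does not give the exact template.

```python
import math

# LoRA settings as reported in the model card.
LORA_SETTINGS = {
    "r": 128,
    "lora_alpha": 128,
    "use_rslora": True,
    "target_modules": [
        "q_proj", "k_proj", "v_proj", "o_proj",   # attention projections
        "gate_proj", "up_proj", "down_proj",      # MLP projections
        "embed_tokens", "lm_head",                # embedding and output layers
    ],
}


def lora_scaling(r: int, lora_alpha: int, use_rslora: bool) -> float:
    """Multiplier applied to the low-rank update B @ A.

    Rank-stabilized LoRA divides by sqrt(r); standard LoRA divides by r.
    """
    return lora_alpha / math.sqrt(r) if use_rslora else lora_alpha / r


# Placeholder EOS string (assumption). In a real pipeline this would be
# read from the tokenizer rather than hard-coded.
EOS_TOKEN = "<|endoftext|>"


def format_article(title: str, body: str, eos_token: str = EOS_TOKEN) -> str:
    """Join an article title and body into one training example.

    The "title, blank line, body" layout is an assumed template.
    """
    return f"{title.strip()}\n\n{body.strip()}{eos_token}"


scaling = lora_scaling(
    LORA_SETTINGS["r"], LORA_SETTINGS["lora_alpha"], LORA_SETTINGS["use_rslora"]
)
example = format_article("Ankara", "Ankara, Türkiye'nin başkentidir.")

print(f"rsLoRA scaling: {scaling:.3f}")
print(repr(example))
```

With r equal to lora_alpha, standard LoRA scaling would be exactly 1, whereas the rank-stabilized variant gives sqrt(128). This keeps the update magnitude from shrinking as the rank grows. Appending the EOS token marks the boundary between articles when examples are packed into training sequences.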

Intended Use Cases

  • Improved Turkish Reasoning: Designed to provide more consistent reasoning within Turkish contexts.
  • Structured Explanations: Aims to offer more organized explanations for information-intensive questions in Turkish.
  • Enhanced Technical Responses: Expected to deliver better flow in technical and analytical answers in Turkish.

Important Considerations

Users should be aware that the model may carry biases reflecting its training data distribution. For critical applications such as legal, health, or financial domains, human expert review is strongly recommended. The model is released under the Apache-2.0 license.