wololoo/Llama-3.2-3B-TR-Instruct-DPO

Text Generation · Model Size: 3.2B · Quant: BF16 · Ctx Length: 32k · Published: Jan 9, 2026 · License: llama3.2 · Architecture: Transformer

wololoo/Llama-3.2-3B-TR-Instruct-DPO is a 3.2 billion parameter causal language model developed by wololoo, fine-tuned from unsloth/Llama-3.2-3B-Instruct. This model is specifically optimized for Turkish language capabilities and enhanced reasoning skills in STEM fields, utilizing a two-stage SFT and DPO training pipeline. It serves as a general-purpose Turkish assistant, excelling in Turkish question-answering and providing foundational information in science, technology, and engineering topics, with a context length of 32768 tokens.


wololoo/Llama-3.2-3B-TR-Instruct-DPO Overview

This model, developed by wololoo, is a Turkish-optimized version of the Llama-3.2-3B-Instruct base model. It has undergone a two-stage training process (SFT + DPO) to significantly improve its Turkish language proficiency and its reasoning capabilities within STEM (Science, Technology, Engineering, Mathematics) domains.

Key Capabilities & Training

  • Enhanced Turkish Language: Supervised Fine-Tuning (SFT) was performed using the atasoglu/databricks-dolly-15k-tr dataset to improve general conversational abilities and instruction following in Turkish.
  • STEM Reasoning: Direct Preference Optimization (DPO) was applied using the yusufbaykaloglu/Turkish-STEM-DPO-Dataset to specifically refine the model's answer quality in scientific and technical subjects.
  • Model Type: It is a 3.2 billion parameter Causal Language Model (Transformer) with a 32768 token context length.
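The DPO stage mentioned above optimizes a pairwise preference loss over chosen/rejected answer pairs. As a rough illustration only (not the model's actual training code, which is not published here), the per-pair loss can be computed from sequence log-probabilities under the policy and the frozen reference model:

```python
import math

def dpo_loss(policy_chosen_logp: float, policy_rejected_logp: float,
             ref_chosen_logp: float, ref_rejected_logp: float,
             beta: float = 0.1) -> float:
    """DPO loss for one preference pair: -log sigmoid(beta * margin).

    Inputs are total sequence log-probabilities; beta=0.1 is a common
    default, not a value documented for this model.
    """
    chosen_ratio = policy_chosen_logp - ref_chosen_logp
    rejected_ratio = policy_rejected_logp - ref_rejected_logp
    margin = beta * (chosen_ratio - rejected_ratio)
    # log(1 + exp(-margin)) == -log(sigmoid(margin)); guard against overflow
    # for strongly negative margins, where the loss is approximately -margin.
    return math.log1p(math.exp(-margin)) if margin > -30 else -margin
```

When the policy agrees with the reference (zero margin) the loss is log 2; as the policy assigns relatively more probability to the chosen answer than the reference does, the loss falls toward zero, which is what drives the answer-quality refinement on the STEM preference pairs.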

Ideal Use Cases

This model functions effectively as a general-purpose Turkish AI assistant, particularly strong in:

  • Turkish question-answering (QA).
  • Providing fundamental information in STEM fields.
  • Text generation, summarization, and editing in Turkish.
  • Chatbot applications requiring Turkish interaction.
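For these use cases, the model can be served through the standard `transformers` chat-template API. The sketch below is a minimal example: the Turkish system prompt, generation settings, and `bfloat16` dtype are illustrative choices, not values prescribed by the model card.

```python
MODEL_ID = "wololoo/Llama-3.2-3B-TR-Instruct-DPO"

def build_messages(question: str) -> list[dict]:
    """Build a chat-template message list with a Turkish system prompt.

    The system prompt ("You are a helpful Turkish assistant.") is an
    illustrative choice, not one documented by the model card.
    """
    return [
        {"role": "system", "content": "Sen yardımcı bir Türkçe asistansın."},
        {"role": "user", "content": question},
    ]

def generate(question: str, max_new_tokens: int = 256) -> str:
    # Heavy dependency imported lazily so the helper above stays usable
    # without transformers installed.
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
    model = AutoModelForCausalLM.from_pretrained(MODEL_ID, torch_dtype="bfloat16")
    inputs = tokenizer.apply_chat_template(
        build_messages(question), add_generation_prompt=True, return_tensors="pt"
    )
    outputs = model.generate(inputs, max_new_tokens=max_new_tokens)
    # Decode only the newly generated tokens, not the echoed prompt.
    return tokenizer.decode(outputs[0][inputs.shape[-1]:], skip_special_tokens=True)

if __name__ == "__main__":
    print(generate("Fotosentez nedir?"))  # "What is photosynthesis?"
```

Because the model is instruction-tuned from Llama-3.2-3B-Instruct, using the tokenizer's built-in chat template (rather than hand-formatting prompts) is the safest way to match the format seen during SFT and DPO training.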

Limitations

Like all large language models, it may exhibit "hallucinations" or reflect biases from its training data. Responses, especially on technical or critical subjects, should always be verified. It is not intended for medical, legal, or financial advice, nor for generating harmful or illegal content.