ottys/dpo-qwen-cot-merged
Text Generation · Concurrency Cost: 1 · Model Size: 4B · Quant: BF16 · Ctx Length: 32k · Published: Feb 28, 2026 · License: apache-2.0 · Architecture: Transformer · Open Weights

The ottys/dpo-qwen-cot-merged model is a 4-billion-parameter instruction-tuned causal language model based on the Qwen3-4B-Instruct-2507 architecture. Developed by ottys, it applies Direct Preference Optimization (DPO) on a filtered subset of official DPO data to improve the accuracy of structured data output and Chain-of-Thought reasoning. With a 32,768-token context length, the model is suited to tasks that require precise structured outputs and multi-step reasoning.
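A minimal inference sketch with the Hugging Face `transformers` library is shown below. The model ID comes from this card; everything else (prompt content, generation settings, the `build_messages` helper) is illustrative, and loading in `bfloat16` matches the BF16 quantization listed above. This assumes `transformers` and a PyTorch backend are installed.

```python
MODEL_ID = "ottys/dpo-qwen-cot-merged"  # model ID from this card
MAX_CONTEXT = 32_768                    # 32k context length stated above


def build_messages(user_prompt: str) -> list[dict]:
    """Wrap a prompt in the chat-message format expected by
    instruction-tuned models (a hypothetical helper for this sketch)."""
    return [{"role": "user", "content": user_prompt}]


def generate(prompt: str, max_new_tokens: int = 256) -> str:
    """Load the model and generate a completion for a single prompt.
    Imports are deferred so the helpers above work without transformers."""
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
    model = AutoModelForCausalLM.from_pretrained(MODEL_ID, torch_dtype="bfloat16")
    inputs = tokenizer.apply_chat_template(
        build_messages(prompt),
        add_generation_prompt=True,
        return_tensors="pt",
    )
    output = model.generate(inputs, max_new_tokens=max_new_tokens)
    # Decode only the newly generated tokens, skipping the prompt.
    return tokenizer.decode(output[0][inputs.shape[-1]:], skip_special_tokens=True)


if __name__ == "__main__":
    # An example prompt exercising the structured-output focus described above.
    print(generate("Return a JSON object with keys 'name' and 'age' for Alice, 30."))
```

Since the model targets structured outputs, a JSON-producing prompt like the one above is a natural smoke test; swap in your own prompts and generation parameters as needed.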
