qqo/dpo-qwen-cot-merged
Text generation · Open weights
Model size: 1.5B
Quantization: BF16
Context length: 32k
Published: Feb 17, 2026
License: apache-2.0
Architecture: Transformer
Concurrency cost: 1

qqo/dpo-qwen-cot-merged is a Qwen1.5-Instruct model fine-tuned with Direct Preference Optimization (DPO) using the Unsloth library. The fine-tuning targets stronger Chain-of-Thought (CoT) reasoning and higher-quality structured responses. The repository ships fully merged 16-bit (BF16) weights, so the model can be loaded directly, without adapter weights, in applications that need improved logical coherence and structured output.
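Because the weights are fully merged, the model can be loaded like any standard Hugging Face checkpoint. A minimal sketch with the `transformers` library, assuming the repo id from this card and a chat template inherited from the Qwen base model; the prompt and generation parameters are illustrative, not part of the card:

```python
# Sketch: load the merged BF16 weights and run one CoT-style prompt.
# Assumes the qqo/dpo-qwen-cot-merged repo is accessible and fits in memory.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "qqo/dpo-qwen-cot-merged"  # repo id from this card
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,  # matches the card's BF16 quantization
    device_map="auto",
)

# Build a chat prompt; a reasoning-oriented question exercises the CoT tuning.
messages = [{"role": "user", "content": "Explain step by step why 17 is prime."}]
prompt = tokenizer.apply_chat_template(
    messages, tokenize=False, add_generation_prompt=True
)
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)

outputs = model.generate(**inputs, max_new_tokens=256)
# Decode only the newly generated tokens, skipping the prompt.
print(tokenizer.decode(
    outputs[0][inputs["input_ids"].shape[-1]:], skip_special_tokens=True
))
```

Since the weights are merged rather than stored as a LoRA adapter, no PEFT or Unsloth dependency is needed at inference time.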
