sfutenma/dpo-qwen3_4b-cot-merged_v260302-093614
Text Generation · Concurrency Cost: 1 · Model Size: 4B · Quant: BF16 · Ctx Length: 32k · Published: Mar 2, 2026 · License: apache-2.0 · Architecture: Transformer · Open Weights · Warm

sfutenma/dpo-qwen3_4b-cot-merged_v260302-093614 is a 4-billion-parameter Qwen3-based language model, fine-tuned with Direct Preference Optimization (DPO) to improve Chain-of-Thought reasoning and structured response quality. It supports a 32,768-token (32k) context length and is tuned to align its outputs with preferred responses, making it suitable for applications that require logical coherence and structured text generation.
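As a rough illustration, the model can be queried like any other Hugging Face causal LM through the `transformers` chat-template API. This is a minimal sketch, not an official snippet from the model author: the prompt text and generation settings are placeholder choices, and loading the 4B weights in BF16 assumes sufficient memory on your device.

```python
MODEL_ID = "sfutenma/dpo-qwen3_4b-cot-merged_v260302-093614"

def build_chat(user_prompt: str) -> list[dict]:
    """Wrap a user prompt in the chat-message format that
    instruction/DPO-aligned models expect via their chat template."""
    return [{"role": "user", "content": user_prompt}]

def main() -> None:
    # Imported inside main() so the helper above can be used
    # without pulling in transformers or downloading the model.
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
    model = AutoModelForCausalLM.from_pretrained(MODEL_ID, torch_dtype="bfloat16")

    # A reasoning-style prompt; the model card says the fine-tune
    # targets Chain-of-Thought behavior.
    messages = build_chat("Explain step by step why 17 is a prime number.")
    inputs = tokenizer.apply_chat_template(
        messages, add_generation_prompt=True, return_tensors="pt"
    )
    out = model.generate(inputs, max_new_tokens=512)
    # Decode only the newly generated tokens, skipping the prompt.
    print(tokenizer.decode(out[0][inputs.shape[-1]:], skip_special_tokens=True))

if __name__ == "__main__":
    main()
```

Generation parameters (sampling temperature, `max_new_tokens`, device placement) are left at illustrative defaults and should be tuned for the target workload.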
