SumiYama/dpo-qwen-cot-merged
TEXT GENERATIONConcurrency Cost:1Model Size:4BQuant:BF16Ctx Length:32kPublished:Feb 4, 2026License:apache-2.0Architecture:Transformer Open Weights Warm

SumiYama/dpo-qwen-cot-merged is a Qwen3-4B-Instruct-2507 based language model developed by SumiYama, fine-tuned using LoRA for specific agent-based tasks. This model specializes in handling DB_Bench (SQL) and ALFWorld (household task) formats, making it suitable for applications requiring structured interaction and task execution. It leverages a merged LoRA SFT approach to enhance performance on these targeted agent tasks.

Loading preview...