HCY123902/qwen25_7b_base_hc_tsss_n32_r1_dpo
Text Generation · Model Size: 7.6B · Quant: FP8 · Ctx Length: 32k · Concurrency Cost: 1 · Architecture: Transformer · Published: Apr 10, 2026

HCY123902/qwen25_7b_base_hc_tsss_n32_r1_dpo is a 7.6-billion-parameter language model fine-tuned from Qwen/Qwen2.5-7B using Direct Preference Optimization (DPO) with TRL, which trains the model to better align its responses with human preferences. It supports a context length of 32,768 tokens and is intended for text generation that reflects the conversational qualities targeted during preference training.
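The checkpoint can be loaded like any Hugging Face causal LM. The sketch below is not part of the model card: the sampling parameters and prompt are illustrative assumptions, and the heavy model download is kept behind a `__main__` guard.

```python
# Hypothetical usage sketch for this checkpoint with Hugging Face transformers.
# All generation settings here are assumptions, not values from the model card.

MODEL_ID = "HCY123902/qwen25_7b_base_hc_tsss_n32_r1_dpo"

def generation_config(max_new_tokens: int = 256) -> dict:
    """Return illustrative sampling settings for model.generate()."""
    return {
        "max_new_tokens": max_new_tokens,
        "do_sample": True,
        "temperature": 0.7,
        "top_p": 0.9,
    }

if __name__ == "__main__":
    # Deferred import so the helper above works without transformers installed.
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
    model = AutoModelForCausalLM.from_pretrained(
        MODEL_ID, torch_dtype="auto", device_map="auto"
    )

    # Encode a prompt, generate, and decode the completion.
    inputs = tokenizer("Explain DPO in one sentence.", return_tensors="pt").to(model.device)
    outputs = model.generate(**inputs, **generation_config())
    print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```

Because the model was trained on a 32k context, long prompts can be passed directly; the `generation_config` helper only controls the length and randomness of the completion.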