HCY123902/qwen25_7b_base_hc_ssss_n32_r1_no_know_dpo
Text Generation | Concurrency Cost: 1 | Model Size: 7.6B | Quant: FP8 | Ctx Length: 32k | Published: Apr 9, 2026 | Architecture: Transformer | Cold

HCY123902/qwen25_7b_base_hc_ssss_n32_r1_no_know_dpo is a 7.6-billion-parameter language model fine-tuned from Qwen/Qwen2.5-7B. It was trained with Direct Preference Optimization (DPO) using the TRL library, so that its outputs align more closely with human preferences. It supports a context length of 32,768 tokens, which suits tasks with long inputs. The fine-tuning aims to improve conversational ability and response quality.
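For readers unfamiliar with DPO, the following is a minimal sketch of the standard per-example DPO objective, written in plain Python for illustration. It is not the training code used for this model: the function name `dpo_loss`, the example log-probabilities, and `beta=0.1` are assumptions chosen for the example, and the actual hyperparameters for this checkpoint are not stated in the card.

```python
import math


def dpo_loss(policy_chosen_logp, policy_rejected_logp,
             ref_chosen_logp, ref_rejected_logp, beta=0.1):
    """Per-example DPO loss: -log(sigmoid(beta * margin)).

    Each argument is the summed log-probability of a full response
    (chosen or rejected) under the policy or the frozen reference model.
    beta is an illustrative value, not the one used for this checkpoint.
    """
    # How much more the policy favors each response than the reference does.
    chosen_shift = policy_chosen_logp - ref_chosen_logp
    rejected_shift = policy_rejected_logp - ref_rejected_logp
    x = beta * (chosen_shift - rejected_shift)

    # Numerically stable -log(sigmoid(x)) = log(1 + exp(-x)).
    if x >= 0:
        return math.log1p(math.exp(-x))
    return -x + math.log1p(math.exp(x))


# Hypothetical log-probabilities, for illustration only.
baseline = dpo_loss(-10.0, -12.0, -10.0, -12.0)  # policy == reference
improved = dpo_loss(-9.0, -13.0, -10.0, -12.0)   # policy prefers chosen more
print(baseline, improved)
```

When the policy matches the reference, the margin is zero and the loss equals log 2; the loss falls as the policy raises the chosen response's likelihood relative to the rejected one, measured against the reference model.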
