chenyongxi/Qwen2.5-1.5B-DPO-1.5B
Text Generation · Concurrency Cost: 1 · Model Size: 1.5B · Quant: BF16 · Ctx Length: 32k · Published: Apr 1, 2026 · Architecture: Transformer

chenyongxi/Qwen2.5-1.5B-DPO-1.5B is a 1.5-billion-parameter language model based on the Qwen2.5 architecture, fine-tuned with Direct Preference Optimization (DPO) on the BAAI/Infinity-Preference dataset to align its outputs more closely with human preferences. It targets conversational and preference-aligned text generation, and its 32,768-token context length makes it suitable for tasks that require understanding and generating over longer inputs.
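To make the DPO fine-tuning concrete, the sketch below implements the standard DPO objective for a single preference pair. It is illustrative only: the log-probabilities and the `beta` value are made-up numbers, not this checkpoint's actual training configuration.

```python
import math

def dpo_loss(policy_chosen_logp, policy_rejected_logp,
             ref_chosen_logp, ref_rejected_logp, beta=0.1):
    """DPO loss for one preference pair: -log sigmoid of the scaled
    reward margin between the chosen and rejected responses.

    Each *_logp argument is the summed log-probability a model assigns
    to a response; beta scales the implicit KL penalty against the
    frozen reference model. All values here are hypothetical.
    """
    chosen_reward = beta * (policy_chosen_logp - ref_chosen_logp)
    rejected_reward = beta * (policy_rejected_logp - ref_rejected_logp)
    margin = chosen_reward - rejected_reward
    # -log(sigmoid(margin)) == log(1 + exp(-margin)), via log1p for stability
    return math.log1p(math.exp(-margin))

# The loss shrinks as the policy prefers the chosen response more
# strongly than the reference model does.
weak = dpo_loss(-12.0, -10.0, -11.0, -11.0)   # policy favors the rejected answer
strong = dpo_loss(-9.0, -14.0, -11.0, -11.0)  # policy favors the chosen answer
```

Minimizing this loss over a preference dataset (such as BAAI/Infinity-Preference) pushes the policy to rank chosen responses above rejected ones without training a separate reward model.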
