ojaffe/20260411-190341-align-qwen-0d3d-2026-04-12-023-moderate-ob-dpo
Text generation · Concurrency cost: 1 · Model size: 0.8B · Quant: BF16 · Ctx length: 32k · Published: Apr 12, 2026 · Architecture: Transformer · Cold

The ojaffe/20260411-190341-align-qwen-0d3d-2026-04-12-023-moderate-ob-dpo model is a 0.8-billion-parameter language model fine-tuned with Direct Preference Optimization (DPO). The card does not name the base model (the repository name suggests a Qwen base, though this is unconfirmed), and the model supports a context length of 32768 tokens. DPO fine-tuning indicates the model was optimized to align with human preferences, making it suited to tasks that require nuanced response generation and moderation.
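The card does not publish training details, but the standard DPO objective it refers to can be sketched in plain Python. The function below is a minimal, illustrative implementation of the per-example DPO loss; the log-probability values in the usage line are made-up numbers, and `beta=0.1` is a common but assumed default.

```python
import math

def dpo_loss(policy_chosen_logp, policy_rejected_logp,
             ref_chosen_logp, ref_rejected_logp, beta=0.1):
    """Per-example Direct Preference Optimization loss.

    Each argument is the summed log-probability of the preferred
    ("chosen") or dispreferred ("rejected") response under either the
    policy being trained or the frozen reference model. `beta` scales
    how strongly the policy is pushed away from the reference.
    """
    chosen_logratio = policy_chosen_logp - ref_chosen_logp
    rejected_logratio = policy_rejected_logp - ref_rejected_logp
    margin = beta * (chosen_logratio - rejected_logratio)
    # Loss is -log(sigmoid(margin)); compute it stably for both signs.
    if margin >= 0:
        return math.log1p(math.exp(-margin))
    return -margin + math.log1p(math.exp(margin))

# Illustrative values: the policy prefers the chosen response more
# strongly than the reference does, so the loss falls below log(2).
print(dpo_loss(-10.0, -12.0, -11.0, -11.5, beta=0.1))
```

The loss equals `log(2)` when the policy and reference agree exactly, and shrinks as the policy's preference for the chosen response grows relative to the reference's.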
