ojaffe/2026-04-09-310000-lora-dpo-14b-v1
Text generation · Concurrency cost: 1 · Model size: 14B · Quant: FP8 · Context length: 32k · Published: Apr 10, 2026 · Architecture: Transformer · Cold

The ojaffe/2026-04-09-310000-lora-dpo-14b-v1 model is a 14-billion-parameter language model fine-tuned from Qwen/Qwen3-14B. It was trained with Direct Preference Optimization (DPO) using the TRL framework, which aligns the model's outputs with human preferences. It is intended for general text generation, particularly tasks that benefit from preference-based fine-tuning.
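To make the training objective concrete, here is a minimal pure-Python sketch of the DPO loss for a single preference pair. The function name, the toy log-probabilities, and the `beta=0.1` default are illustrative assumptions (not taken from this model's actual training configuration); in practice TRL's `DPOTrainer` computes this over batches of tokenized completions.

```python
import math

def dpo_loss(logp_w, logp_l, ref_logp_w, ref_logp_l, beta=0.1):
    """DPO loss for one preference pair (illustrative, not the actual
    training code for this model).

    logp_w, logp_l:         summed log-probs of the chosen (w) and
                            rejected (l) completions under the policy.
    ref_logp_w, ref_logp_l: the same quantities under the frozen
                            reference model (here, the base model).
    beta:                   temperature controlling how far the policy
                            may drift from the reference (assumed value).
    """
    # Implicit reward margin: how much more the policy prefers the chosen
    # completion over the rejected one, relative to the reference model.
    margin = (logp_w - ref_logp_w) - (logp_l - ref_logp_l)
    # -log(sigmoid(beta * margin)), written with log1p for stability.
    return math.log1p(math.exp(-beta * margin))

# Policy already favors the chosen completion -> small loss.
low = dpo_loss(-10.0, -20.0, -12.0, -18.0)
# Policy favors the rejected completion -> larger loss.
high = dpo_loss(-20.0, -10.0, -12.0, -18.0)
print(low < high)  # True
```

Minimizing this loss pushes the policy to assign a higher relative likelihood to preferred completions without a separately trained reward model, which is the appeal of DPO over classic RLHF.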
