ogwata/exp27-dpo-r16
Text Generation | Concurrency Cost: 1 | Model Size: 4B | Quant: BF16 | Ctx Length: 32k | Published: Feb 28, 2026 | License: apache-2.0 | Architecture: Transformer | Open Weights | Warm

ogwata/exp27-dpo-r16 is a 4-billion-parameter language model fine-tuned from ogwata/exp26-sft-r16-merged using Direct Preference Optimization (DPO). The published weights are fully merged 16-bit weights, so the model loads directly and no separate adapter is needed. DPO training aims to align the model's responses with human preferences, which makes it suitable for applications where the preferred style or quality of a response matters.
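
Because the weights are merged, loading should need only the base model classes and no adapter step. The following is a minimal sketch, assuming the repository is hosted on the Hugging Face Hub and is compatible with the `transformers` classes `AutoTokenizer` and `AutoModelForCausalLM`; the card confirms neither. The helper names `load_model` and `generate` are hypothetical and chosen only for illustration.

```python
MODEL_ID = "ogwata/exp27-dpo-r16"  # repository name as given on this card


def load_model(model_id: str = MODEL_ID):
    """Load tokenizer and merged BF16 weights directly, with no adapter step.

    Assumption: the repository works with the generic Auto* classes.
    """
    # Heavy dependencies are imported lazily so this module stays cheap to import.
    import torch
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(model_id)
    model = AutoModelForCausalLM.from_pretrained(
        model_id,
        torch_dtype=torch.bfloat16,  # matches the BF16 quantization listed on the card
    )
    model.eval()  # inference mode: disables dropout
    return tokenizer, model


def generate(prompt: str, max_new_tokens: int = 128) -> str:
    """Generate a continuation for `prompt` and return only the new text."""
    import torch

    tokenizer, model = load_model()
    inputs = tokenizer(prompt, return_tensors="pt")
    with torch.no_grad():
        output_ids = model.generate(**inputs, max_new_tokens=max_new_tokens)
    # Drop the prompt tokens so only the newly generated tokens are decoded.
    prompt_length = inputs["input_ids"].shape[1]
    return tokenizer.decode(output_ids[0, prompt_length:], skip_special_tokens=True)
```

Calling `generate("Explain preference optimization in one sentence.")` would download the weights on first use and run on CPU unless the model is moved to another device. If the model ships a chat template, formatting the prompt with it would likely give better results, but that is not stated on the card.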
