tomofusa/exp033-dpo-wd005-merged
Text Generation | Concurrency Cost: 1 | Model Size: 4B | Quant: BF16 | Ctx Length: 32k | Published: Mar 1, 2026 | License: apache-2.0 | Architecture: Transformer | Open Weights | Warm

The tomofusa/exp033-dpo-wd005-merged model is a 4 billion parameter language model developed by tomofusa, built as a merged SFT (supervised fine-tuning) and DPO (Direct Preference Optimization) model. It is provided as full 16-bit (BF16) weights, so no adapter needs to be loaded separately. The DPO stage used a learning rate of 5e-07 and a beta of 0.1, which makes the model a candidate for tasks that benefit from preference-based alignment.
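To illustrate what the beta of 0.1 controls, here is a minimal sketch of the standard per-example DPO objective, using only the Python standard library. The function names `softplus` and `dpo_loss` and the example log-probabilities are hypothetical; the model card does not publish its training code, so this shows the textbook loss and not necessarily the author's exact implementation.

```python
import math


def softplus(x: float) -> float:
    """Numerically stable log(1 + exp(x))."""
    return max(x, 0.0) + math.log1p(math.exp(-abs(x)))


def dpo_loss(
    policy_chosen_logp: float,
    policy_rejected_logp: float,
    ref_chosen_logp: float,
    ref_rejected_logp: float,
    beta: float = 0.1,  # beta value stated in the model description
) -> float:
    """Per-example DPO loss: -log(sigmoid(beta * margin)).

    Each argument is the summed log-probability of a full response
    (chosen or rejected) under the policy or the frozen reference model.
    """
    # How much more the policy favors each response than the reference does.
    chosen_logratio = policy_chosen_logp - ref_chosen_logp
    rejected_logratio = policy_rejected_logp - ref_rejected_logp
    # beta scales how strongly the preference margin is rewarded.
    margin = beta * (chosen_logratio - rejected_logratio)
    # -log(sigmoid(m)) equals softplus(-m).
    return softplus(-margin)


if __name__ == "__main__":
    # Hypothetical log-probabilities, chosen only for illustration.
    print(dpo_loss(-10.0, -12.0, -10.0, -12.0))  # policy equals reference
    print(dpo_loss(-8.0, -14.0, -10.0, -12.0))   # policy prefers chosen more
```

When the policy matches the reference, the margin is zero and the loss is log 2 (about 0.693). The loss falls as the policy raises the chosen response's probability relative to the rejected one, and a smaller beta makes that change more gradual.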
