li-muyang/zephyr-7b-gemma-dpo
TEXT GENERATION · Concurrency Cost: 1 · Model Size: 8B · Quant: FP8 · Ctx Length: 32k · Published: Apr 21, 2025 · Architecture: Transformer · Cold

li-muyang/zephyr-7b-gemma-dpo is an 8-billion-parameter language model based on the Gemma architecture, aligned with Direct Preference Optimization (DPO). As with any DPO run, the model is not trained from scratch: DPO fine-tunes an existing checkpoint against pairs of chosen and rejected responses, and the card's reported reward metrics for those two response types reflect this preference-based alignment. It is intended for tasks that benefit from preference tuning, though specific downstream use cases are not yet documented.
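To make the "reward metrics for chosen and rejected responses" concrete, here is a minimal sketch of the standard DPO loss for a single preference pair. The log-probability values below are invented toy numbers, not taken from this model's training run; `beta` is the usual DPO temperature hyperparameter.

```python
import math

def dpo_loss(pi_chosen, pi_rejected, ref_chosen, ref_rejected, beta=0.1):
    """DPO loss for one preference pair, given summed token log-probs
    under the policy (pi_*) and the frozen reference model (ref_*).

    The implicit rewards below are what DPO model cards typically
    report as rewards/chosen and rewards/rejected.
    """
    chosen_reward = beta * (pi_chosen - ref_chosen)
    rejected_reward = beta * (pi_rejected - ref_rejected)
    margin = chosen_reward - rejected_reward
    # -log sigmoid(margin): small when the chosen response is
    # clearly preferred over the rejected one
    return -math.log(1.0 / (1.0 + math.exp(-margin)))

# Toy log-probabilities (nats): the policy favors the chosen response
# more strongly than the reference does, so the loss is below log(2).
loss = dpo_loss(pi_chosen=-12.0, pi_rejected=-20.0,
                ref_chosen=-14.0, ref_rejected=-15.0, beta=0.1)
```

Training pushes `chosen_reward` up and `rejected_reward` down relative to the reference, which is why the gap between the two reported reward metrics is a common health check for a DPO run.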
