statking/zephyr-7b-sft-full-orpo
Text generation · Concurrency cost: 1 · Model size: 7B · Quantization: FP8 · Context length: 8K · License: apache-2.0 · Architecture: Transformer · Open weights · Cold

statking/zephyr-7b-sft-full-orpo is a 7 billion parameter language model fine-tuned from mistralai/Mistral-7B-v0.1. It was trained with ORPO (Odds Ratio Preference Optimization) on the HuggingFaceH4/ultrafeedback_binarized dataset, reaching a reward accuracy of 0.6587, i.e. the model's implicit reward ranks the chosen response above the rejected one in roughly 66% of preference pairs. The result is a chat model aligned toward human-preferred responses, obtained without the separate reference model that DPO-style methods require.
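
The exact training script is not published here, but the sketch below shows how an ORPO run of this shape can be set up with TRL's ORPOTrainer on the dataset named above. The hyperparameters (beta, batch size, learning rate, epochs) are illustrative assumptions, not the values used for this checkpoint.

```python
# A minimal ORPO training sketch using TRL; hyperparameters are assumed,
# not the author's actual configuration for this checkpoint.
from datasets import load_dataset
from transformers import AutoModelForCausalLM, AutoTokenizer
from trl import ORPOConfig, ORPOTrainer

base_model = "mistralai/Mistral-7B-v0.1"
tokenizer = AutoTokenizer.from_pretrained(base_model)
tokenizer.pad_token = tokenizer.eos_token  # Mistral's tokenizer ships no pad token
model = AutoModelForCausalLM.from_pretrained(base_model)

# Each row pairs a prompt with a preferred ("chosen") and dispreferred
# ("rejected") completion; ORPO learns from the contrast between the two.
dataset = load_dataset("HuggingFaceH4/ultrafeedback_binarized", split="train_prefs")

config = ORPOConfig(
    output_dir="zephyr-7b-sft-full-orpo",
    beta=0.1,                      # weight of the odds-ratio penalty (assumed)
    max_length=2048,
    per_device_train_batch_size=2,
    learning_rate=5e-6,
    num_train_epochs=1,
)

trainer = ORPOTrainer(
    model=model,
    args=config,
    train_dataset=dataset,
    tokenizer=tokenizer,           # newer trl versions name this processing_class
)
trainer.train()
```

Because ORPO folds the preference penalty into the supervised fine-tuning loss, this single trainer covers the whole run; the reward accuracy quoted above corresponds to the trainer's rewards/accuracies metric, the fraction of pairs where the chosen response scores higher than the rejected one.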
