ShenaoZhang/0.001_idpo_noreplacerej_iter_1
Text generation · Model size: 7B · Quantization: FP8 · Context length: 8k · Concurrency cost: 1 · Published: Apr 7, 2024 · License: MIT · Architecture: Transformer · Open weights

ShenaoZhang/0.001_idpo_noreplacerej_iter_1 is a 7-billion-parameter language model fine-tuned from HuggingFaceH4/mistral-7b-sft-beta. It was trained on the HuggingFaceH4/ultrafeedback_binarized dataset with a learning rate of 5e-07 and a total batch size of 128. The model is intended for tasks that benefit from fine-tuning on feedback-binarized preference data, and is published as one specialized iteration for research and development.
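
Below is a minimal usage sketch, assuming the checkpoint is available on the Hugging Face Hub under this repo id, follows the standard causal-LM interface in the transformers library, and inherits a chat template from its mistral-7b-sft-beta base; the prompt text is purely illustrative.

```python
# Minimal sketch: loading and prompting the model with transformers.
# Assumes the repo id below resolves on the Hugging Face Hub and that
# the tokenizer ships a chat template (as the SFT base model does).
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "ShenaoZhang/0.001_idpo_noreplacerej_iter_1"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype="auto",   # pick the checkpoint's native precision
    device_map="auto",    # requires accelerate; places weights on available devices
)

# Build a chat-style prompt and generate a short completion.
messages = [{"role": "user", "content": "Summarize preference fine-tuning in one sentence."}]
inputs = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

outputs = model.generate(inputs, max_new_tokens=128, do_sample=False)
print(tokenizer.decode(outputs[0][inputs.shape[-1]:], skip_special_tokens=True))
```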
