mlfoundations-dev/simpo-oh_teknium_scaling_down_ratiocontrolled_0.9
Text generation · Concurrency cost: 1 · Model size: 8B · Quantization: FP8 · Context length: 32k · License: llama3.1 · Architecture: Transformer · Status: Warm

mlfoundations-dev/simpo-oh_teknium_scaling_down_ratiocontrolled_0.9 is an 8-billion-parameter language model, fine-tuned from mlfoundations-dev/oh_teknium_scaling_down_ratiocontrolled_0.9 with SimPO (Simple Preference Optimization) on the mlfoundations-dev/gemma2-ultrafeedback-armorm preference dataset. On its evaluation set it reached a loss of 2.9107 and a reward accuracy of 0.7604. With a context length of 32,768 tokens, it targets text-generation tasks that benefit from preference-tuned behavior.
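
The page does not include usage code; the snippet below is a minimal sketch of loading the checkpoint for text generation with Hugging Face transformers, assuming it is a standard causal-LM repository. The prompt, dtype, and generation parameters are illustrative choices, not values from the model card.

```python
# Minimal sketch: greedy text generation with the checkpoint above,
# assuming a standard AutoModelForCausalLM-compatible repository.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "mlfoundations-dev/simpo-oh_teknium_scaling_down_ratiocontrolled_0.9"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,  # assumption: bf16 inference on a GPU
    device_map="auto",
)

prompt = "Explain preference optimization in one paragraph."
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=256, do_sample=False)

# Decode only the newly generated tokens, dropping the echoed prompt.
new_tokens = outputs[0][inputs["input_ids"].shape[-1]:]
print(tokenizer.decode(new_tokens, skip_special_tokens=True))
```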
