mlfoundations-dev/simpo-evol_tt_5s
Text Generation · Concurrency Cost: 1 · Model Size: 8B · Quantization: FP8 · Context Length: 32k · License: llama3.1 · Architecture: Transformer · Warm

mlfoundations-dev/simpo-evol_tt_5s is an 8-billion-parameter language model, fine-tuned from mlfoundations-dev/evol_tt_5s on the mlfoundations-dev/gemma2-ultrafeedback-armorm preference dataset. The model reports a reward accuracy of 0.8001, indicating a strong ability to distinguish preferred from rejected responses. It is intended for tasks that require nuanced understanding and preference alignment, making it suitable for applications where response quality and agreement with human feedback are critical.
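A minimal usage sketch with Hugging Face `transformers` is shown below. The model ID comes from this card; the chat-template call, generation parameters, and the example prompt are assumptions, and running the full generation requires downloading the 8B checkpoint onto a suitably sized GPU.

```python
MODEL_ID = "mlfoundations-dev/simpo-evol_tt_5s"


def build_messages(user_prompt: str) -> list[dict]:
    """Wrap a user prompt in the chat-message format used by apply_chat_template."""
    return [{"role": "user", "content": user_prompt}]


def generate(prompt: str, max_new_tokens: int = 256) -> str:
    """Load the model and generate a completion (assumed setup; needs GPU + weights)."""
    # Deferred import so the helpers above work without transformers installed.
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
    model = AutoModelForCausalLM.from_pretrained(MODEL_ID, device_map="auto")
    inputs = tokenizer.apply_chat_template(
        build_messages(prompt), add_generation_prompt=True, return_tensors="pt"
    ).to(model.device)
    outputs = model.generate(inputs, max_new_tokens=max_new_tokens)
    # Decode only the newly generated tokens, skipping the prompt.
    return tokenizer.decode(outputs[0][inputs.shape[-1]:], skip_special_tokens=True)


if __name__ == "__main__":
    print(generate("Summarize what preference alignment means in one sentence."))
```

Since the context length is 32k tokens, long documents can be passed in a single prompt without chunking in most cases.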
