mlfoundations-dev/simpo-oh-dcft-v3.1-llama-3.1-nemotron-70b
Text Generation · Concurrency Cost: 1 · Model Size: 8B · Quant: FP8 · Ctx Length: 32k · License: llama3.1 · Architecture: Transformer · Status: Warm
mlfoundations-dev/simpo-oh-dcft-v3.1-llama-3.1-nemotron-70b is an 8-billion-parameter language model fine-tuned from mlfoundations-dev/oh-dcft-v3.1-llama-3.1-nemotron-70b, using SimPO preference optimization (as the "simpo" prefix suggests) on the mlfoundations-dev/gemma2-ultrafeedback-armorm dataset. It supports a 32,768-token context length and achieves a reward accuracy of 0.8145 on its evaluation set, i.e., the fraction of evaluation pairs in which the model ranks the preferred response above the rejected one.
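Below is a minimal sketch of loading the model for text generation with the Hugging Face `transformers` library, assuming the checkpoint is hosted on the Hub under the same identifier; the prompt and generation settings are illustrative only.

```python
# Minimal usage sketch (assumes `transformers`, `torch`, and `accelerate` are installed,
# and that the checkpoint is available on the Hugging Face Hub under this identifier).
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "mlfoundations-dev/simpo-oh-dcft-v3.1-llama-3.1-nemotron-70b"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype="auto",   # pick the checkpoint's native precision
    device_map="auto",    # place weights across available devices
)

# Build a chat-formatted prompt; the message content is just an example.
messages = [{"role": "user", "content": "Summarize SimPO in one sentence."}]
inputs = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

# Generate a short completion and decode only the newly produced tokens.
outputs = model.generate(inputs, max_new_tokens=128)
print(tokenizer.decode(outputs[0][inputs.shape[-1]:], skip_special_tokens=True))
```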