nvidia/Llama-3.3-Nemotron-70B-Reward-Multilingual
Text Generation · Concurrency Cost: 4 · Model Size: 70B · Quant: FP8 · Ctx Length: 32k · Published: May 28, 2025 · License: nvidia-open-model-license · Architecture: Transformer · Open Weights

nvidia/Llama-3.3-Nemotron-70B-Reward-Multilingual is a 70-billion-parameter reward model developed by NVIDIA, built on the Meta-Llama-3.3-70B-Instruct foundation. It is fine-tuned with scaled Bradley-Terry modeling to predict the quality of LLM-generated responses across multiple languages. Given a conversation, the model assigns a scalar reward score to the assistant turn, indicating response quality; among Bradley-Terry reward models it achieves top scores on RM-Bench (82.4%) and JudgeBench (69.4%).
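Under Bradley-Terry modeling, the scalar rewards the model emits are only meaningful relative to one another: the probability that one response is preferred over another is the sigmoid of their reward difference. The sketch below illustrates that relationship with made-up reward values (it does not call the model itself):

```python
import math

def bt_preference_prob(reward_chosen: float, reward_rejected: float) -> float:
    """Bradley-Terry probability that the 'chosen' response beats the
    'rejected' one, given their scalar reward scores.

    The reward values used below are illustrative, not model outputs.
    """
    return 1.0 / (1.0 + math.exp(-(reward_chosen - reward_rejected)))

# Equal rewards -> a 50/50 preference.
print(bt_preference_prob(0.0, 0.0))   # 0.5

# A response scored noticeably higher is strongly preferred.
print(bt_preference_prob(2.0, -1.0))  # ~0.95
```

This is why such reward scores are typically used to rank or compare candidate responses to the same prompt, rather than interpreted as absolute quality grades.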
