nvidia/Llama-3.3-Nemotron-70B-Reward
Text generation · Concurrency cost: 4 · Model size: 70B · Quant: FP8 · Context length: 32k · Published: May 28, 2025 · License: nvidia-open-model-license · Architecture: Transformer · Open weights

nvidia/Llama-3.3-Nemotron-70B-Reward is a 70-billion-parameter reward model developed by NVIDIA, built on the Meta-Llama-3.3-70B-Instruct foundation. It is fine-tuned with scaled Bradley-Terry modeling to predict the quality of LLM-generated responses, assigning a reward score to the final assistant turn in an English conversation of up to 4,096 tokens. The model excels at evaluating response quality, scoring 73.7% on JudgeBench and 79.9% on RM-Bench, which makes it well suited for ranking and improving LLM outputs.
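To make the workflow concrete, here is a minimal sketch of how scalar rewards from such a model can be used. The `bradley_terry_loss` function shows the standard (unscaled) pairwise Bradley-Terry objective that preference-trained reward models optimize, and `rank_responses` shows the downstream use described above: ordering candidate completions by their scores. The reward values here are mock numbers standing in for the model's actual outputs; the helper names and the scores are illustrative, not part of NVIDIA's API.

```python
import math

def bradley_terry_loss(r_chosen: float, r_rejected: float) -> float:
    """Pairwise Bradley-Terry loss: -log P(chosen preferred over rejected),
    where the preference probability is sigmoid(r_chosen - r_rejected)."""
    return -math.log(1.0 / (1.0 + math.exp(-(r_chosen - r_rejected))))

def rank_responses(scored: list[tuple[str, float]]) -> list[tuple[str, float]]:
    """Sort (response, reward) pairs by reward, best first."""
    return sorted(scored, key=lambda pair: pair[1], reverse=True)

# Mock reward scores standing in for the model's outputs on three drafts.
scored = [("draft A", -1.2), ("draft B", 3.4), ("draft C", 0.7)]
best_response = rank_responses(scored)[0][0]  # "draft B"
```

Note that the model card specifies a *scaled* Bradley-Terry variant for training; the unscaled form above is shown only because it is the common baseline the scaled version builds on.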
