nvidia/Llama-3.3-Nemotron-70B-Reward-Multilingual

Available on Hugging Face

  • Task: Text Generation
  • Model Size: 70B
  • Quantization: FP8
  • Context Length: 32k
  • Concurrency Cost: 4
  • Published: May 28, 2025
  • License: nvidia-open-model-license
  • Architecture: Transformer
  • Weights: Open

nvidia/Llama-3.3-Nemotron-70B-Reward-Multilingual is a 70 billion parameter reward model developed by NVIDIA, built upon the Meta-Llama-3.3-70B-Instruct foundation. It is fine-tuned using scaled Bradley-Terry modeling to predict the quality of LLM-generated responses in a multilingual context. The model assigns reward scores to assistant turns in conversations as an indicator of response quality, achieving the top score on RM-Bench (82.4%) and the second-highest on JudgeBench (69.4%) among Bradley-Terry Reward Models.


Model Overview

nvidia/Llama-3.3-Nemotron-70B-Reward-Multilingual is a 70 billion parameter reward model developed by NVIDIA, leveraging the Meta-Llama-3.3-70B-Instruct foundation. It is specifically fine-tuned using scaled Bradley-Terry modeling to assess the quality of LLM-generated responses in multilingual conversations. The model processes multi-turn conversations up to 4,096 tokens and outputs a single float value representing the quality of the final assistant turn.
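The contract is simple: the model takes a multi-turn conversation in and returns one float for the final assistant turn. A minimal sketch of scoring a conversation over HTTP, assuming the model is served behind an OpenAI-compatible chat-completions endpoint that returns the scalar as text like `reward:-1.25` in the message content — the endpoint path, request shape, and `reward:<float>` response format are assumptions about the serving layer, not the documented API; consult your deployment's model card for the real schema:

```python
import json
import re
import urllib.request


def parse_reward(content: str) -> float:
    """Extract the scalar reward from a response string such as 'reward:-1.25'.

    The 'reward:<float>' format is an assumption about the serving layer.
    """
    match = re.search(r"reward:\s*(-?\d+(?:\.\d+)?)", content)
    if match is None:
        raise ValueError(f"no reward found in: {content!r}")
    return float(match.group(1))


def score_conversation(base_url: str, messages: list[dict]) -> float:
    """Send a conversation (last turn = assistant) and return its reward score."""
    payload = json.dumps({
        "model": "nvidia/llama-3.3-nemotron-70b-reward-multilingual",
        "messages": messages,
    }).encode()
    req = urllib.request.Request(
        f"{base_url}/chat/completions",
        data=payload,
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        body = json.load(resp)
    # The reward is assumed to arrive as text inside the first choice's message.
    return parse_reward(body["choices"][0]["message"]["content"])
```

Only `parse_reward` is independent of the serving stack; everything in `score_conversation` depends on how the endpoint is deployed.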

Key Capabilities & Performance

  • Response Quality Scoring: Assigns a reward score to LLM-generated responses, where a higher score indicates higher quality. This score is relative to other responses for the same prompt.
  • Multilingual Support: Designed to evaluate responses across various languages.
  • Benchmark Leader: As of May 15, 2025, it achieves the highest score on RM-Bench at 82.4% and the second-highest on JudgeBench at 69.4% among Bradley-Terry Reward Models.
  • Foundation: Built on the Llama 3.3 Transformer architecture.
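The scaled Bradley-Terry objective behind these scores can be illustrated in plain Python: a pairwise loss that pushes the chosen response's reward above the rejected one's, with a scaling factor standing in for preference strength. This is an illustrative formulation of the technique, not NVIDIA's training code, and the exact scaling used for this model is an assumption:

```python
import math


def bradley_terry_loss(r_chosen: float, r_rejected: float, scale: float = 1.0) -> float:
    """Pairwise Bradley-Terry loss: -log(sigmoid(scale * (r_chosen - r_rejected))).

    `scale` stands in for the preference-strength scaling in "scaled"
    Bradley-Terry training; its exact form here is an assumption.
    """
    diff = scale * (r_chosen - r_rejected)
    # -log(sigmoid(diff)), computed stably for both signs of diff:
    # for diff >= 0 this is log(1 + e^-diff); for diff < 0, -diff + log(1 + e^diff).
    return max(-diff, 0.0) + math.log1p(math.exp(-abs(diff)))
```

When the two rewards tie, the loss is log 2; it shrinks toward zero as the chosen response's reward pulls ahead, which is what drives the model to separate good and bad responses to the same prompt.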

Use Cases

This model is ideal for:

  • LLM Evaluation: Programmatically assessing the quality of responses from other large language models.
  • Reinforcement Learning from Human Feedback (RLHF): Providing a reward signal for training or fine-tuning generative LLMs.
  • Response Ranking: Comparing and ranking different LLM outputs for a given prompt based on their predicted quality.
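For the ranking use case, the scalar scores order candidates directly, remembering that scores are only comparable between responses to the same prompt. A sketch with a stand-in scoring function — `reward` here is a hypothetical callable that a real pipeline would back with calls to the reward model:

```python
from typing import Callable


def rank_responses(
    prompt: str,
    candidates: list[str],
    reward: Callable[[str, str], float],
) -> list[tuple[str, float]]:
    """Rank candidate responses to one prompt by reward score, best first.

    `reward(prompt, response) -> float` is a stand-in for querying the model;
    scores are relative, so only compare responses to the same prompt.
    """
    scored = [(resp, reward(prompt, resp)) for resp in candidates]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


# Toy scorer for illustration only; it rewards longer answers.
ranked = rank_responses("Q", ["ok", "a fuller answer"], lambda p, r: float(len(r)))
```

The same shape works for best-of-n sampling in RLHF pipelines: generate n candidates, score them all, keep the top-ranked one.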

Technical Details

The model was trained using the HelpSteer3-Preference dataset, which includes human-annotated preferences. It is optimized for NVIDIA GPU-accelerated systems and supports NVIDIA Ampere, Hopper, and Turing microarchitectures.