nvidia/Qwen3-Nemotron-235B-A22B-GenRM

TEXT GENERATION · Concurrency Cost: 4 · Model Size: 235B · Quant: FP8 · Ctx Length: 32k · Published: Dec 3, 2025 · License: apache-2.0 · Architecture: Transformer · Open Weights

Qwen3-Nemotron-235B-A22B-GenRM is a 235 billion parameter Generative Reward Model (GenRM) developed by NVIDIA, built on the Qwen3-235B-A22B-Thinking-2507 foundation. The model is fine-tuned to evaluate the quality of AI assistant responses, producing individual helpfulness scores and a ranking score between candidate responses. It is designed for Reinforcement Learning from Human Feedback (RLHF) training, for example of the NVIDIA-Nemotron-3-Nano-30B-A3B-BF16 model, and supports a maximum context length of 128k tokens.


Model Overview

NVIDIA's Qwen3-Nemotron-235B-A22B-GenRM is a 235 billion parameter Generative Reward Model (GenRM) built on the Qwen3 architecture, specifically using the Qwen3-235B-A22B-Thinking-2507 foundation. Its primary function is to evaluate the quality of AI assistant responses by providing helpfulness scores for individual responses and a ranking score between two candidates, given a conversation history and a new user request. This model is integral to the Reinforcement Learning from Human Feedback (RLHF) training process, notably for models like NVIDIA-Nemotron-3-Nano-30B-A3B-BF16.
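To make the input the overview describes concrete, here is a minimal sketch of packing a conversation history and two candidate responses into a single judging prompt. The message schema is the generic OpenAI-style format, and the judging prompt wording is an illustrative assumption; it is not the documented template from the model card.

```python
# A minimal sketch of the evaluation input: a conversation history plus
# candidate responses to the latest user request. The judging prompt wording
# below is an illustrative assumption, not the documented template.

conversation = [
    {"role": "user", "content": "How do I reverse a list in Python?"},
    {"role": "assistant", "content": "You can use slicing: my_list[::-1]."},
    {"role": "user", "content": "Is there an in-place way?"},
]

candidate_a = "Yes, call my_list.reverse(); it reverses the list in place."
candidate_b = "No, Python lists cannot be reversed in place."

# Flatten the history into plain text for the judging turn.
history = "\n".join(f"{m['role']}: {m['content']}" for m in conversation)

judge_messages = [{
    "role": "user",
    "content": (
        "Rate each candidate's helpfulness (1-5) and give a ranking score "
        "(1-6) comparing them.\n\n"
        f"Conversation:\n{history}\n\n"
        f"Candidate A: {candidate_a}\n"
        f"Candidate B: {candidate_b}"
    ),
}]
print(judge_messages[0]["content"])
```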

Key Capabilities

  • Response Evaluation: Assesses the quality of AI assistant responses, outputting individual helpfulness scores (1-5) and comparative ranking scores (1-6); see the serving sketch after this list.
  • RLHF Integration: Designed to facilitate the fine-tuning of other language models through RLHF.
  • High Performance: Optimized for NVIDIA GPU-accelerated systems, using software stacks such as CUDA for faster training and inference.
  • Extensive Context: Supports an input context of up to 128k tokens.
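For the serving side, the sketch below queries the model through an OpenAI-compatible endpoint, such as a locally hosted vLLM server, and parses a helpfulness score from the generated judgment. The base_url, the plain-text "Helpfulness: N" output convention, and the regex are assumptions for illustration; the model card defines the actual output format.

```python
# A minimal sketch of scoring a response via an OpenAI-compatible endpoint
# (e.g., a local vLLM server). The base_url and the "Helpfulness: N" output
# convention are illustrative assumptions.
import re
from openai import OpenAI

client = OpenAI(base_url="http://localhost:8000/v1", api_key="EMPTY")

completion = client.chat.completions.create(
    model="nvidia/Qwen3-Nemotron-235B-A22B-GenRM",
    messages=[{
        "role": "user",
        "content": (
            "Rate the helpfulness (1-5) of this response.\n\n"
            "User: How do I reverse a list in Python?\n"
            "Response: Use my_list[::-1] or the reversed() built-in."
        ),
    }],
    temperature=0.0,
)

judgment = completion.choices[0].message.content
# Extract a 1-5 score, assuming the judgment states it as "Helpfulness: N".
match = re.search(r"[Hh]elpfulness:\s*([1-5])", judgment)
helpfulness = int(match.group(1)) if match else None
print(judgment)
print("parsed score:", helpfulness)
```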

Performance Benchmarks

The model demonstrates strong performance across various evaluation suites:

  • RM-Bench: Achieves an 87.3 Overall score, with high scores in Math (96.9) and Safety (94.4).
  • JudgeBench: Scores an 87.4 Overall, with notable results in Reasoning (95.9) and Code (95.2).

Use Cases

This model is ideal for developers and researchers focused on:

  • RLHF Reward Modeling: Directly applicable in RLHF pipelines to improve the alignment and quality of generative AI models, as sketched below.
  • Automated Response Quality Assessment: Can be integrated into systems requiring automated evaluation of chatbot or assistant outputs.
  • Research in AI Alignment: Provides a robust tool for studying and implementing preference-based learning.
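As a sketch of the RLHF use case, the snippet below turns the GenRM's pairwise ranking score into (chosen, rejected) preference pairs of the kind consumed by preference-based training pipelines. The scale semantics assumed here (1-3 favors response A, 4-6 favors response B) are an illustrative assumption; consult the model card for the actual convention.

```python
# A minimal sketch of converting a 1-6 pairwise ranking score into a
# preference pair for RLHF-style training data. The scale semantics
# (1-3 favors A, 4-6 favors B) are an illustrative assumption.
from typing import NamedTuple

class PreferencePair(NamedTuple):
    prompt: str
    chosen: str
    rejected: str

def to_preference_pair(prompt: str, resp_a: str, resp_b: str,
                       ranking_score: int) -> PreferencePair:
    """Map a 1-6 ranking score onto a (chosen, rejected) pair."""
    if not 1 <= ranking_score <= 6:
        raise ValueError("ranking score must be in 1-6")
    # Assumed convention: the lower half of the scale prefers response A.
    if ranking_score <= 3:
        return PreferencePair(prompt, chosen=resp_a, rejected=resp_b)
    return PreferencePair(prompt, chosen=resp_b, rejected=resp_a)

pair = to_preference_pair(
    "How do I reverse a list in Python?",
    "Use my_list[::-1] or the reversed() built-in.",
    "Lists cannot be reversed.",
    ranking_score=2,  # score as returned by the GenRM (assumed semantics)
)
print(pair.chosen)
```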