Starling-LM-7B-alpha
Llama-3.1-Nemotron-70B-Reward-HF
ctx-bird-reward-250121
Qwen3-Nemotron-8B-BRRM
Qwen3-Nemotron-14B-BRRM
PaTaRM-8B
PaTaRM-14B
RewardAnything-8B-v1
sycofact
JSL-MedMNX-7B-v2.0
Storm-7B
Starling-LM-7B-beta
ToolRM-Gen-Qwen3-4B-Thinking-2507
Starling-LM-7B-beta-laser-dpo
karma-electric-llama31-8b
ThinkPRM-1.5B
ThinkPRM-7B
ThinkPRM-14B
WebArbiter-7B
IntelliAsk-Qwen3-32B-450-Merged
IF-Verifier-7B
WebArbiter-8B-Qwen3
WebArbiter-3B
SciRM-Ref-7B
gPRM-14B-merged
R-PRM-7B-DPO
SciRM-7B
WebArbiter-4B-Qwen3
Llama-3.1-8B-FoVer-PRM-2026
Llama-3.1-8B-FoVer-PRM-old
Qwen-2.5-7B-FoVer-PRM-2026
gORM-14B-merged
Multiclass-Think-RM-8B
Qwen2.5-Math-1.5B-Scoring-Mean
SOLE-R1-8B