ilgee/Multiclass-Think-RM-8B
Text generation · Concurrency cost: 1 · Model size: 8B · Quantization: FP8 · Context length: 32k · Published: May 8, 2025 · License: llama3.1 · Architecture: Transformer

ilgee/Multiclass-Think-RM-8B is an 8-billion-parameter generative reward model developed by Ilgee Hong et al. Fine-tuned from Llama-3.1-8B-Instruct, it produces an internal thinking process for long-horizon reasoning before rendering a verdict, enabling more nuanced and interpretable preference judgments. The model targets complex, reasoning-intensive evaluation tasks and, rather than a binary preference, outputs a multiclass preference score from -3 to 3.
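As a minimal sketch of how such a generative reward model might be used for pairwise judging: the helper below builds an illustrative comparison prompt and extracts the final -3 to 3 score from the model's generated judgment text. The prompt wording, verdict format, and parsing regex are assumptions for illustration, not the authors' exact protocol.

```python
import re
from typing import Optional

def build_prompt(question: str, answer_a: str, answer_b: str) -> str:
    # Hypothetical pairwise-judgment prompt; the real model's template may differ.
    return (
        "Compare the two responses to the question below. Think step by step, "
        "then output a preference score from -3 (A much better) to 3 (B much better).\n\n"
        f"Question: {question}\n\n"
        f"Response A: {answer_a}\n\n"
        f"Response B: {answer_b}"
    )

def parse_score(judgment: str) -> Optional[int]:
    # Assume the verdict is the last integer in [-3, 3] appearing in the
    # generated text, after the model's thinking trace.
    for token in reversed(re.findall(r"-?\d+", judgment)):
        value = int(token)
        if -3 <= value <= 3:
            return value
    return None
```

In use, the prompt would be sent through the model's chat template, the generation (thinking trace plus verdict) decoded, and `parse_score` applied to the decoded text.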
