OpenRubrics/RubricRM-8B-Judge

TEXT GENERATIONConcurrency Cost:1Model Size:8BQuant:FP8Ctx Length:32kTool Calling:SupportedPublished:Oct 9, 2025Architecture:Transformer Cold

OpenRubrics/RubricRM-8B-Judge is an 8 billion parameter language model, fine-tuned from Qwen3/Qwen3-8B, specifically designed for impartial evaluation of AI responses. This model acts as a judge, assessing 'Response A' and 'Response B' against a given instruction and a detailed rubric. It excels at structured evaluation, identifying gatekeeper criteria, and providing justified decisions based on a multi-phase analysis process.

Loading preview...

OpenRubrics/RubricRM-8B-Judge Overview

OpenRubrics/RubricRM-8B-Judge is an 8 billion parameter model, fine-tuned from the Qwen3/Qwen3-8B architecture, specialized in acting as an impartial judge for evaluating AI-generated responses. Its core function is to compare two responses ('Response A' and 'Response B') against a provided instruction and a detailed rubric.

Key Capabilities

  • Structured Evaluation: The model follows a multi-phase evaluation process, starting with a compliance check based on objective 'Gatekeeper Criteria' from the rubric.
  • Detailed Analysis: It performs an item-by-item analysis of each response against all rubric criteria, distinguishing between objective (Hard Rule) and subjective (Principle) criteria.
  • Impartial Judgment: The model is designed to provide a final, justified decision on which response is superior, adhering to a strict output format.
  • Rubric-Driven Assessment: It requires a RubricRM-Rubric to guide its evaluation, ensuring consistency and fairness.

Usage and Integration

Developers can integrate this model using the Hugging Face transformers library. It requires a specific prompt template that includes the instruction, rubric, and the two responses to be evaluated. The model's output format is strictly defined, ensuring parseable and consistent evaluation results. This model is particularly useful for automated quality assurance, reward modeling, and LLM alignment tasks where objective and structured evaluation is critical.