Llama-3-OffsetBias-8B: A Debiased Generative Judge Model
NCSOFT/Llama-3-OffsetBias-8B is an 8-billion-parameter generative judge model developed by NC Research, fine-tuned from Meta-Llama-3-8B-Instruct. Its primary purpose is pairwise preference evaluation: it acts as a robust evaluator that mitigates common biases found in other evaluation models. The model was introduced in the paper "OffsetBias: Leveraging Debiased Data for Tuning Evaluators" and is designed to select the superior output between two candidates for a given instruction.
Key Capabilities & Features
- Bias Robustness: Specifically trained to be more resilient to various evaluation biases, ensuring fairer and more objective assessments.
- Pairwise Preference Evaluation: Given an instruction and two candidate outputs, Output (a) and Output (b), the model predicts which one is better.
- Instruction-Tuned: Fine-tuned on a diverse set of datasets including UltraFeedback, HelpSteer, hh-rlhf, PKU-SafeRLHF, and the NCSOFT/offsetbias dataset released alongside the paper.
- Specific Prompt Format: Requires a precise prompt template for optimal performance, ensuring consistent and accurate evaluation.
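Since the model expects a precise prompt template, a helper that assembles the pairwise-comparison prompt is useful. The sketch below is illustrative only: the exact wording and section headers required by Llama-3-OffsetBias-8B are specified in its model card, and the strings used here are assumptions in that general style.

```python
# Sketch: build a pairwise-evaluation prompt for a generative judge model.
# NOTE: the exact template required by Llama-3-OffsetBias-8B is defined in
# its model card; the header strings below are illustrative assumptions.

def build_judge_prompt(instruction: str, output_a: str, output_b: str) -> str:
    """Format an instruction and two candidate outputs for pairwise judging."""
    return (
        "You are a helpful assistant in evaluating the quality of the outputs "
        "for a given instruction. Select the better output, (a) or (b).\n\n"
        f"# Instruction:\n{instruction}\n\n"
        f"# Output (a):\n{output_a}\n\n"
        f"# Output (b):\n{output_b}\n\n"
        "# Which is better, Output (a) or Output (b)? "
        'Your response should be either "Output (a)" or "Output (b)":'
    )

if __name__ == "__main__":
    prompt = build_judge_prompt(
        "Summarize the plot of Hamlet in one sentence.",
        "Hamlet is a play.",
        "Prince Hamlet seeks revenge on his uncle Claudius, who murdered "
        "Hamlet's father to seize the throne.",
    )
    print(prompt)
```

The resulting string would typically be wrapped in the Llama-3 chat format (e.g. via a tokenizer's chat template) before generation.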
Use Cases
- Automated Model Evaluation: Ideal for developers and researchers needing an objective and debiased method to compare the quality of responses from different AI models.
- Quality Assurance: Can be integrated into pipelines to automatically identify preferred outputs based on specific criteria.
- Research on Evaluation Biases: Useful for studying and mitigating biases in LLM evaluation processes.
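For pipeline integration along the lines of the quality-assurance use case above, the sketch below routes between two candidate outputs based on a judge verdict. `judge_fn` is a hypothetical placeholder for an actual call to the model (e.g. generating with the formatted prompt); only the verdict parsing and routing logic is shown.

```python
# Sketch: select the preferred output given a pairwise judge verdict.
# `judge_fn` is a hypothetical callable wrapping the judge model; it should
# return a verdict string such as 'Output (a)' or 'Output (b)'.
from typing import Callable

def select_preferred(
    instruction: str,
    output_a: str,
    output_b: str,
    judge_fn: Callable[[str, str, str], str],
) -> str:
    """Return whichever candidate output the pairwise judge prefers."""
    verdict = judge_fn(instruction, output_a, output_b)
    if "(a)" in verdict:
        return output_a
    if "(b)" in verdict:
        return output_b
    raise ValueError(f"Unparseable judge verdict: {verdict!r}")

if __name__ == "__main__":
    # Stub judge for demonstration: always prefers the longer candidate.
    stub = lambda i, a, b: "Output (a)" if len(a) >= len(b) else "Output (b)"
    best = select_preferred(
        "Explain what DNS does.",
        "DNS maps human-readable domain names to IP addresses.",
        "DNS.",
        stub,
    )
    print(best)
```

Raising on an unparseable verdict, rather than defaulting to one side, avoids silently introducing a position bias into the pipeline.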
Evaluation results on the LLMBar and EvalBiasBench benchmarks demonstrate its effectiveness, in particular its robustness to biases related to output length, concreteness, and empty references.