prometheus-eval/prometheus-7b-v2.0
Prometheus 2 (prometheus-eval/prometheus-7b-v2.0) is a 7-billion-parameter language model developed by prometheus-eval, built on Mistral-Instruct with a 4096-token context length. It is fine-tuned on 100K feedback and 200K preference examples for fine-grained evaluation of other LLMs, and it can also serve as a reward model for RLHF. Through weight merging, a single model supports both absolute grading (direct assessment) and relative grading (pairwise ranking), making it an open alternative to GPT-4 for detailed LLM evaluation.
Prometheus 2: A Specialized LLM Evaluator
Prometheus 2 is a 7-billion-parameter language model, based on Mistral-Instruct, designed for fine-grained evaluation of other Large Language Models (LLMs) and for use as a reward model in Reinforcement Learning from Human Feedback (RLHF). It offers a specialized alternative to general-purpose models like GPT-4 for evaluation tasks.
Key Capabilities & Features
- Dual Grading Formats: Supports both absolute grading (direct assessment with a 1-5 score) and relative grading (pairwise ranking of two responses).
- Weight Merging: Uses weight merging to combine absolute and relative grading in a single model; the authors report that merging also improves performance on each format.
- Specialized Training: Fine-tuned on 100K feedback entries from the Feedback Collection and 200K preference entries from the Preference Collection.
- Prompt Format Guidance: Provides specific prompt templates for both absolute and relative grading, requiring components like instruction, response(s), reference answer, and score rubrics.
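The components named in the last bullet can be assembled into a prompt programmatically. The following is a minimal Python sketch, not the official Prometheus 2 template: the section headers, the `Rubric` class, and the `build_absolute_prompt` and `build_relative_prompt` functions are hypothetical names chosen for illustration. The exact wording the model expects should be taken from its published prompt templates.

```python
from dataclasses import dataclass
from typing import Dict


@dataclass
class Rubric:
    """Hypothetical container: one criterion plus a description per score 1-5."""
    criterion: str
    descriptions: Dict[int, str]


def format_rubric(rubric: Rubric) -> str:
    """Render the rubric as a criterion line followed by one line per score."""
    lines = [f"[{rubric.criterion}]"]
    for score in range(1, 6):
        lines.append(f"Score {score}: {rubric.descriptions[score]}")
    return "\n".join(lines)


def build_absolute_prompt(instruction: str, response: str,
                          reference_answer: str, rubric: Rubric) -> str:
    """Assemble a direct-assessment prompt (illustrative layout only)."""
    return (
        f"###Instruction:\n{instruction}\n\n"
        f"###Response to evaluate:\n{response}\n\n"
        f"###Reference answer:\n{reference_answer}\n\n"
        f"###Score rubric:\n{format_rubric(rubric)}\n\n"
        "###Feedback:"
    )


def build_relative_prompt(instruction: str, response_a: str, response_b: str,
                          reference_answer: str, criterion: str) -> str:
    """Assemble a pairwise-ranking prompt (illustrative layout only)."""
    return (
        f"###Instruction:\n{instruction}\n\n"
        f"###Response A:\n{response_a}\n\n"
        f"###Response B:\n{response_b}\n\n"
        f"###Reference answer:\n{reference_answer}\n\n"
        f"###Score rubric:\n[{criterion}]\n\n"
        "###Feedback:"
    )
```

Keeping the two builders separate mirrors the two grading formats: absolute grading needs a description for each score level, while pairwise ranking only needs the criterion used to pick the better response.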
When to Use Prometheus 2
- Evaluating LLM Outputs: Suited to developers and researchers who need detailed, rubric-based assessments of LLM responses.
- RLHF Applications: Suitable for generating reward signals in Reinforcement Learning from Human Feedback pipelines.
- Fine-grained Analysis: When a simple pass/fail or general score isn't enough, and detailed feedback based on specific criteria is required.
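To use the evaluator's output as a reward signal, its text has to be turned into a number. The sketch below assumes the evaluator ends its feedback with a marker such as `[RESULT] 4`; that marker, the function names, and the linear mapping of the 1-5 score onto [0, 1] are assumptions for illustration and are not specified on this page.

```python
import re
from typing import Optional

# Assumed output convention: free-text feedback followed by "[RESULT] <1-5>".
_RESULT_PATTERN = re.compile(r"\[RESULT\]\s*([1-5])\b")


def parse_absolute_score(evaluator_output: str) -> Optional[int]:
    """Return the last 1-5 score found after a [RESULT] marker, or None."""
    matches = _RESULT_PATTERN.findall(evaluator_output)
    return int(matches[-1]) if matches else None


def score_to_reward(score: int) -> float:
    """Map a 1-5 score linearly onto [0.0, 1.0] (one possible scaling)."""
    if not 1 <= score <= 5:
        raise ValueError(f"score must be between 1 and 5, got {score}")
    return (score - 1) / 4.0
```

Returning `None` for unparsable output leaves it to the caller to retry the evaluation or drop the sample, rather than silently assigning a default reward.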
Prometheus 2 is an open-source model, with its research detailed in the paper "Prometheus 2: An Open Source Language Model Specialized in Evaluating Other Language Models" (arXiv:2405.01535).