Overview
Unbabel/M-Prometheus-14B is a 14.8-billion-parameter large language model developed as an open LLM judge. Unlike general-purpose LLMs, it is built to evaluate multilingual outputs natively. It was trained on 480,000 instances of multilingual direct assessment and pairwise comparison data, each accompanied by detailed long-form feedback.
Key Capabilities
- Multilingual Evaluation: Designed to assess text quality across multiple languages.
- Detailed Feedback Generation: Provides comprehensive feedback based on specific score rubrics, similar to Prometheus-2.
- Scoring System: Assigns an integer score from 1 to 5 according to predefined rubric criteria such as Accuracy, Fluency, and Style.
- Machine Translation (MT) Evaluation: Optimized for direct-assessment MT evaluation, using a structured prompt format to guide its assessment.
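As a rough illustration of the direct-assessment workflow described above, the sketch below assembles a structured prompt and extracts the integer score from a judge response. The prompt layout, the function names `build_da_prompt` and `parse_score`, and the `[RESULT]` marker are assumptions made for this example (the marker follows the Prometheus-2 convention); the model's official prompt template may differ and should be taken from its model card.

```python
import re
from typing import Optional


def build_da_prompt(source: str, translation: str, reference: str, rubric: str) -> str:
    """Assemble a direct-assessment prompt.

    Hypothetical layout for illustration only, not the official template.
    """
    return (
        "Evaluate the translation below against the reference and the rubric.\n"
        "Write feedback, then finish with '[RESULT] <integer from 1 to 5>'.\n\n"
        f"Source text:\n{source}\n\n"
        f"Translation to evaluate:\n{translation}\n\n"
        f"Reference translation:\n{reference}\n\n"
        f"Score rubric:\n{rubric}\n"
    )


def parse_score(judge_output: str) -> Optional[int]:
    """Return the 1-5 score after the last '[RESULT]' marker, or None.

    Assumes the judge ends its feedback with '[RESULT] <score>'.
    """
    matches = re.findall(r"\[RESULT\]\s*([1-5])\b", judge_output)
    return int(matches[-1]) if matches else None


prompt = build_da_prompt(
    source="Das Wetter ist heute schön.",
    translation="The weather is nice today.",
    reference="The weather is lovely today.",
    rubric="Accuracy: does the translation preserve the source meaning?",
)
score = parse_score("Feedback: accurate and fluent. [RESULT] 5")
```

Returning `None` for a missing or out-of-range score lets the caller decide whether to retry or discard the sample rather than silently accepting a malformed judgment.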
Good For
- Automated quality assessment of multilingual text generation.
- Evaluating machine translation outputs against reference answers and detailed rubrics.
- Researchers and developers needing an open-source, specialized LLM for judging text quality in diverse languages.
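For automated assessment across several languages, per-sample judge scores are typically summarized per language. The following is a minimal sketch of that step, assuming scores have already been parsed into integers from 1 to 5; the function name `mean_score_by_language` and the sample data are hypothetical and do not come from the model's documentation.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Tuple


def mean_score_by_language(scored: Iterable[Tuple[str, int]]) -> Dict[str, float]:
    """Average 1-5 judge scores per language code."""
    buckets: Dict[str, List[int]] = defaultdict(list)
    for lang, score in scored:
        # Reject anything outside the 1-5 scale the judge is expected to use.
        if not 1 <= score <= 5:
            raise ValueError(f"score out of range for {lang}: {score}")
        buckets[lang].append(score)
    return {lang: sum(scores) / len(scores) for lang, scores in buckets.items()}


# Made-up scores, for illustration only.
summary = mean_score_by_language([("de", 4), ("de", 5), ("pt", 3)])
```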