TIGER-Lab/TIGERScore-7B
TIGER-Lab's TIGERScore-7B is a 7-billion-parameter, LLaMA-2-based model designed for explainable, reference-free evaluation of text generation. It is fine-tuned on the MetricInstruct dataset, which spans 6 text generation tasks and 23 datasets. The model produces detailed error analyses, identifying each error's location, aspect, explanation, and penalty score, making it a broadly applicable, interpretable evaluation tool.
TIGERScore-7B: Explainable, Reference-Free Text Generation Evaluation
TIGERScore-7B, developed by TIGER-Lab, is a 7-billion-parameter model built on LLaMA-2 and designed to evaluate text generation tasks without needing a reference. It addresses common limitations of existing metrics, such as reliance on references, domain specificity, and lack of attribution, by producing detailed, instruction-guided error analysis.
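Since TIGERScore-7B is a standard causal LM on the Hugging Face Hub, one way to try it is with plain transformers. The sketch below is illustrative only: the prompt layout is an assumption for demonstration, not the exact template used at fine-tuning time (that template is defined in the TIGERScore repository), and the example instruction, source, and output are invented.

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "TIGER-Lab/TIGERScore-7B"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(
    model_name, torch_dtype=torch.bfloat16, device_map="auto"
)

# Assumed prompt layout for illustration; consult the TIGERScore repo
# for the actual instruction template the model was trained with.
prompt = (
    "You are evaluating the quality of a model-generated output.\n"
    "Instruction: Summarize the following article.\n"
    "Source: The city council met on Tuesday to discuss the new transit plan...\n"
    "Model output: The council rejected the transit plan on Monday.\n"
    "List all errors with their location, aspect, explanation, and score reduction."
)

inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=512, do_sample=False)
# Decode only the newly generated tokens, skipping the echoed prompt.
print(tokenizer.decode(outputs[0][inputs["input_ids"].shape[1]:], skip_special_tokens=True))
```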
Key Capabilities
- Reference-Free Evaluation: Assesses generated text quality without requiring a ground-truth reference.
- Explainable Error Analysis: Pinpoints errors with specific locations, aspects, explanations, and penalty scores (see the parsing sketch after this list).
- Universal Applicability: Trained on the comprehensive MetricInstruct dataset, covering 6 diverse text generation tasks and 23 datasets, so it transfers across a wide range of text generation scenarios.
- High Correlation with Human Ratings: Correlates with human judgments more strongly than many existing reference-based and reference-free metrics, as measured by Kendall, Pearson, and Spearman correlations.
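Because each identified error carries a penalty, the free-text analysis can be reduced to a single aggregate score. The sketch below assumes a hypothetical output format with "Score reduction:" lines; the model's actual wording can vary, so treat this parser as a starting point rather than a fixed contract.

```python
import re

# Hypothetical example of a TIGERScore-style error analysis (invented here
# for illustration; real outputs may phrase fields differently).
analysis = """\
Error 1:
Error location: "on Monday"
Error aspect: Factual consistency
Explanation: The source states the council met on Tuesday, not Monday.
Severity: Major
Score reduction: 4.0
Error 2:
Error location: "rejected the transit plan"
Error aspect: Faithfulness
Explanation: The source does not say the plan was rejected.
Severity: Major
Score reduction: 5.0
"""

# Sum the per-error penalties into one overall (negative) quality score.
penalties = [float(p) for p in re.findall(r"Score reduction:\s*([\d.]+)", analysis)]
total_score = -sum(penalties)
print(f"{len(penalties)} errors found, overall score: {total_score}")  # -9.0
```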
Good For
- Developers and researchers needing an automated, interpretable metric for evaluating LLM outputs.
- Tasks requiring detailed feedback on generated text quality, beyond a single score.
- Evaluating text generation models across summarization, translation, data-to-text, long-form QA, MathQA, instruction following, and story generation.
- Situations where ground-truth references are unavailable or difficult to obtain.