TIGERScore-13B: An Explainable, Reference-Free Text Generation Metric
TIGERScore-13B, developed by TIGER-Lab, is a 13-billion-parameter model based on LLaMA-2, designed as a trained metric for evaluating text generation. Unlike traditional metrics that often rely on reference texts or are limited to specific domains, TIGERScore operates reference-free and provides explainable error analysis.
Key Capabilities
- Instruction-Guided Evaluation: Evaluates text generation based on natural language instructions.
- Detailed Error Analysis: Pinpoints errors in generated text by identifying each error's location and aspect, explaining it, and assigning a penalty score.
- Reference-Free: Assesses quality without needing a ground-truth reference output.
- Broad Task Coverage: Trained on the MetricInstruct dataset, covering 6 text generation tasks (Summarization, Translation, Data2Text, Long-form QA, MathQA, and Instruction Following) and 23 datasets.
- High Correlation with Human Ratings: Demonstrates superior correlation with human judgments compared to many existing metrics, both reference-based and reference-free, across various tasks.
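Because the metric is instruction-guided and reference-free, scoring a candidate requires only three pieces of text: the task instruction, the source input, and the hypothesis output. A minimal sketch of assembling such an evaluation prompt follows; the template wording and the `build_eval_prompt` helper are illustrative assumptions, not the model's official prompt format.

```python
import textwrap

def build_eval_prompt(instruction: str, source: str, hypothesis: str) -> str:
    """Assemble an instruction-guided evaluation prompt (assumed template)."""
    # The structure below mirrors what TIGERScore evaluates (instruction,
    # input context, model output), but the exact wording is hypothetical.
    return textwrap.dedent(f"""\
        You are evaluating the following model output.
        Instruction: {instruction}
        Source: {source}
        Model output: {hypothesis}
        List each error with its location, aspect, and explanation, assign a
        score reduction per error, then give the total score reduction.""")

prompt = build_eval_prompt(
    instruction="Summarize the article in one sentence.",
    source="The city council met on Tuesday to approve the new budget...",
    hypothesis="The council rejected the budget on Friday.",
)
# The assembled prompt would then be passed to TIGER-Lab/TIGERScore-13B,
# e.g. through a standard Hugging Face text-generation pipeline.
```

Note that no reference summary appears anywhere in the prompt, which is what allows the metric to run in settings where gold outputs are unavailable.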
Good For
- Automated Evaluation of LLM Outputs: Ideal for developers and researchers needing to automatically assess the quality of text generated by large language models.
- Debugging and Improving Text Generation: The detailed error explanations can help in understanding specific weaknesses in generation models and guide improvements.
- Research in Text Evaluation: Offers a powerful, interpretable, and easy-to-use tool for advancing research in universal explainable metrics for text generation.
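For debugging workflows like those above, the explainable output is most useful once it is machine-readable. The sketch below parses a structured error analysis into records and a total penalty; the raw text format in `SAMPLE_ANALYSIS` is an invented illustration of the location/aspect/explanation/score-reduction structure described earlier, not the model's verbatim output.

```python
import re

# Invented sample of an error analysis with the fields TIGERScore reports:
# location, aspect, explanation, and a per-error score reduction.
SAMPLE_ANALYSIS = (
    'Error 1: Location: "on Friday"; Aspect: Factual consistency; '
    "Explanation: The source says the meeting was on Tuesday; Score reduction: 2\n"
    'Error 2: Location: "rejected"; Aspect: Accuracy; '
    "Explanation: The council approved, not rejected, the budget; Score reduction: 3\n"
)

def parse_errors(analysis: str) -> list[dict]:
    """Extract structured error records from the (assumed) analysis format."""
    pattern = re.compile(
        r'Location: "(?P<location>[^"]+)"; Aspect: (?P<aspect>[^;]+); '
        r"Explanation: (?P<explanation>[^;]+); Score reduction: (?P<penalty>\d+)"
    )
    return [
        {**m.groupdict(), "penalty": int(m.group("penalty"))}
        for m in pattern.finditer(analysis)
    ]

errors = parse_errors(SAMPLE_ANALYSIS)
total_penalty = sum(e["penalty"] for e in errors)  # 5 for the sample above
```

Summing the per-error penalties yields a single quality score, while the individual records can be aggregated by aspect to reveal a generation model's recurring weaknesses.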