Name: rishanthrajendhran/VeriFastScore API
Brand: Featherless.ai
Price: 10.00 USD
Availability: InStock
Author: rishanthrajendhran

Overview

VeriFastScore is an 8-billion-parameter LLaMA 3.1 Instruct model, fine-tuned by NGRAM at UMD and Lambda Labs specifically to evaluate the factuality of long-form LLM-generated text. Unlike multi-step pipeline evaluators, it performs claim extraction and verification jointly in a single inference pass, which significantly reduces latency and computational cost while maintaining high agreement with more expensive methods such as VeriScore.

Key Capabilities

Joint Claim Extraction and Verification: Extracts fine-grained, verifiable claims from long-form responses and simultaneously labels them as 'Supported' or 'Unsupported' based on provided evidence.
Reduced Latency and Cost: Designed to offer a faster and more economical solution for factuality assessment compared to traditional pipeline-based approaches.
High Correlation with Baselines: Achieves strong Pearson correlation with VeriScore, a robust multi-step baseline: 0.86 with claim-level evidence and 0.80 with sentence-level evidence.
System-Level Benchmarking: Reaches a system-level Pearson correlation of 0.94 with VeriScore on model rankings while running 6.6x faster (9.9x excluding retrieval).
Input Flexibility: Takes a generated long-form response and a consolidated set of retrieved evidence sentences as input.
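The capabilities above imply a simple post-processing step: turn the model's labeled claims into a single factuality score. The sketch below is illustrative only. It assumes, hypothetically, that each output line has the form "claim text: Supported" or "claim text: Unsupported"; the actual output format is defined in the full model card and may differ. The names LabeledClaim, parse_labeled_claims, and supported_fraction are invented for this example.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class LabeledClaim:
    text: str
    supported: bool


def parse_labeled_claims(raw_output: str) -> List[LabeledClaim]:
    """Parse lines of the ASSUMED form '<claim>: Supported|Unsupported'."""
    claims = []
    for line in raw_output.splitlines():
        line = line.strip()
        if not line:
            continue
        # Split on the last colon so colons inside the claim text survive.
        text, sep, label = line.rpartition(":")
        label = label.strip().lower()
        if not sep or label not in {"supported", "unsupported"}:
            continue  # skip lines that do not match the assumed format
        claims.append(LabeledClaim(text.strip(), label == "supported"))
    return claims


def supported_fraction(claims: List[LabeledClaim]) -> Optional[float]:
    """Fraction of claims labeled Supported; None if there are no claims."""
    if not claims:
        return None
    return sum(c.supported for c in claims) / len(claims)


# Hypothetical model output, written by hand for illustration.
example_output = """\
The Eiffel Tower is in Paris: Supported
It was completed in 1850: Unsupported
"""
example_claims = parse_labeled_claims(example_output)
example_score = supported_fraction(example_claims)
```

Returning None for a response with no parsed claims keeps "nothing to verify" distinct from "everything unsupported", which a score of 0.0 would conflate.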

Good For

Factuality Evaluation Pipelines: Ideal for integrating into automated evaluation systems, such as those used for RLHF supervision.
Dataset Filtering: Can be used to filter datasets based on the factual accuracy of generated content.
LLM System Benchmarking: Suitable for benchmarking the factuality performance of various large language models at a system level.
Cost-Sensitive Applications: Provides a more efficient alternative for large-scale factuality assessment where cost and speed are critical factors.
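As a rough illustration of the dataset-filtering use case, the sketch below keeps only records whose factuality score meets a threshold. It assumes each record is a dict that already carries a precomputed "factuality_score" field (a hypothetical field name), and the 0.8 cutoff is an arbitrary example value, not a recommendation from the model authors.

```python
from typing import Any, Dict, List


def filter_by_factuality(
    records: List[Dict[str, Any]],
    min_score: float = 0.8,  # arbitrary example threshold (assumption)
) -> List[Dict[str, Any]]:
    """Keep records whose 'factuality_score' is present and >= min_score."""
    kept = []
    for record in records:
        score = record.get("factuality_score")  # hypothetical field name
        if score is not None and score >= min_score:
            kept.append(record)
    return kept


# Hand-written sample records for illustration only.
sample_records = [
    {"id": "a", "factuality_score": 0.95},
    {"id": "b", "factuality_score": 0.40},
    {"id": "c", "factuality_score": None},  # no verifiable claims found
    {"id": "d", "factuality_score": 0.80},
]
filtered = filter_by_factuality(sample_records)
```

Records without a score are dropped here; a pipeline that wants to keep responses with no verifiable claims would need to treat None separately.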
