Model Overview
Salesforce/FARE-8B is an 8-billion-parameter multi-task generative evaluator model, developed by Austin Xu, Xuan-Phi Nguyen, Yilun Zhou, Chien-Sheng Wu, Caiming Xiong, and Shafiq Joty. It is fine-tuned from the Qwen-8B base model and designed for automated evaluation in reasoning-centric domains. The model leverages large-scale multi-task, multi-domain data mixtures and rejection-sampling supervised fine-tuning (SFT) to achieve its specialized evaluation capabilities.
Key Capabilities
- Multi-task Evaluation: Performs a range of evaluation tasks, including pairwise comparisons, step-level error identification, reference-based verification, reference-free verification, and single-rating assessment.
- Reasoning-Centric: Optimized for evaluating responses in domains requiring strong reasoning, as detailed in its accompanying paper.
- Prompt-Template Driven: Designed to be used with specific system and user prompt templates for each evaluation task, ensuring consistent and accurate assessments.
- Long Context: Supports a context length of 32,768 tokens, allowing evaluation of long responses or complex multi-turn interactions.
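As a minimal sketch of the prompt-template-driven workflow, the snippet below assembles a chat-format pairwise-comparison prompt. The system and user template strings here are illustrative placeholders, not the model's official templates; for real evaluations, substitute the task-specific templates that accompany the model.

```python
# Illustrative sketch: building a pairwise-comparison chat prompt for FARE-8B.
# The template text below is a placeholder (an assumption for illustration);
# use the official per-task templates from the model card in practice.

SYSTEM_TEMPLATE = (
    "You are an evaluator. Compare the two candidate responses to the "
    "instruction and state which one is better, 'A' or 'B'."
)

USER_TEMPLATE = (
    "Instruction:\n{instruction}\n\n"
    "Response A:\n{response_a}\n\n"
    "Response B:\n{response_b}"
)

def build_pairwise_messages(instruction: str,
                            response_a: str,
                            response_b: str) -> list[dict]:
    """Assemble a chat-format message list for one pairwise evaluation."""
    return [
        {"role": "system", "content": SYSTEM_TEMPLATE},
        {"role": "user", "content": USER_TEMPLATE.format(
            instruction=instruction,
            response_a=response_a,
            response_b=response_b)},
    ]

# Inference would then follow the standard transformers chat pattern, e.g.:
#   from transformers import AutoModelForCausalLM, AutoTokenizer
#   tok = AutoTokenizer.from_pretrained("Salesforce/FARE-8B")
#   model = AutoModelForCausalLM.from_pretrained("Salesforce/FARE-8B",
#                                                device_map="auto")
#   messages = build_pairwise_messages(instruction, resp_a, resp_b)
#   inputs = tok.apply_chat_template(messages, add_generation_prompt=True,
#                                    return_tensors="pt")
#   verdict_ids = model.generate(inputs.to(model.device), max_new_tokens=1024)
```

The same message-building pattern extends to the other evaluation tasks (verification, single rating, step-level error identification) by swapping in that task's templates.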
Good For
- Automated AI Assistant Evaluation: Ideal for developers and researchers looking to automatically assess the quality and correctness of AI model outputs.
- Comparative Analysis: Excels at pairwise comparisons to determine which of two AI responses is superior.
- Error Identification: Capable of pinpointing specific errors at the step-level within a multi-step solution, particularly useful for mathematical or logical reasoning tasks.
- Research in Generative Evaluators: Serves as a foundational model for further research into scaling multi-task generative evaluator training.