Name: Yale-BIDS-Chen/Llama-3.1-8B-Evidence-Filtering API
Brand: Featherless.ai
Price: 10.00 USD
Availability: InStock
Author: Yale-BIDS-Chen

Yale-BIDS-Chen/Llama-3.1-8B-Evidence-Filtering: Medical Evidence Classification

This model is a fine-tune of Llama-3.1-8B-Instruct, developed by Yale-BIDS-Chen for evidence relevance classification in medical Retrieval-Augmented Generation (RAG) systems. It acts as a lightweight classifier that decides whether a candidate passage provides supporting evidence for a given clinical query.

Key Capabilities & Features

Medical Evidence Filtering: Classifies passages as "Yes" (contains supporting evidence) or "No" (does not contain supporting evidence) for a clinical query.
Improved RAG Quality: Designed to enhance the reliability and interpretability of medical RAG pipelines by filtering out irrelevant information before text generation.
Fine-tuned Performance: Achieves an F1 score of 0.623 on expert-annotated medical query-passage pairs, compared with zero-shot baselines such as Llama-3.1-8B (0.521 F1) and GPT-4o (0.442 F1).
Training Data: Trained on 3,200 expert-labeled query-passage pairs, focusing on the specific task of evidence classification.
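
The binary Yes/No interface described above can be sketched in a few lines of Python. This is only an illustration: the prompt wording and the helper names `build_prompt` and `parse_label` are assumptions made for this example, not the format the model was trained on; consult the full model card for the actual prompt template.

```python
from typing import Optional


def build_prompt(query: str, passage: str) -> str:
    """Assemble a classification prompt (hypothetical wording, not the official template)."""
    return (
        f"Query: {query.strip()}\n"
        f"Passage: {passage.strip()}\n"
        "Does the passage contain supporting evidence for the query? Answer Yes or No."
    )


def parse_label(generated: str) -> Optional[bool]:
    """Map the model's text output to True (Yes), False (No), or None if unrecognized."""
    words = generated.strip().lower().split()
    if not words:
        return None
    first = words[0].strip(".,:;!\"'")  # tolerate trailing punctuation such as "No."
    if first == "yes":
        return True
    if first == "no":
        return False
    return None
```

Returning `None` for unrecognized output lets the caller decide whether to drop the passage, keep it, or retry, rather than silently treating malformed output as "No".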

Intended Use Cases

Medical RAG Systems: Ideal for integration into medical question-answering systems to pre-filter retrieved documents.
Research Purposes: Intended for researchers working on improving the accuracy and efficiency of information retrieval in clinical contexts.
Building Interpretable AI: Contributes to more transparent RAG pipelines by explicitly identifying relevant evidence.
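
As a rough sketch of the pre-filtering use case, the function below keeps only the retrieved passages that a classifier accepts before they are passed to the generator. The names `filter_passages` and `keyword_stub` are hypothetical; the stub is a toy stand-in so the example runs without the model, and in practice `classify` would wrap a call to the fine-tuned model.

```python
from typing import Callable, List


def filter_passages(
    query: str,
    passages: List[str],
    classify: Callable[[str, str], bool],
) -> List[str]:
    """Return the passages for which classify(query, passage) is True, in original order."""
    return [p for p in passages if classify(query, p)]


def keyword_stub(query: str, passage: str) -> bool:
    """Toy stand-in for the model: accept a passage if it shares a longer query word."""
    text = passage.lower()
    return any(word in text for word in query.lower().split() if len(word) > 3)


retrieved = [
    "Metformin is started at 500 mg once daily.",
    "The clinic is closed on public holidays.",
]
kept = filter_passages("metformin dosing", retrieved, keyword_stub)
```

Passing the classifier as a callable keeps the filtering step independent of how the model is served, so the same function works with a local model or a hosted API.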

For detailed methodology and experimental results, refer to the associated paper: Rethinking Retrieval-Augmented Generation for Medicine: A Large-Scale, Systematic Expert Evaluation and Practical Insights.
