Med-V1-L3B: Small Language Model for Biomedical Evidence Attribution
Med-V1-L3B is a 3.2-billion-parameter language model developed by NCBI for efficient and accurate biomedical evidence attribution. Fine-tuned from Llama-3.2-3B-Instruct on a high-quality synthetic dataset called MedFact-Synth, the model assesses whether a given scientific article supports or refutes a biomedical claim. Its 32,768-token context length lets it analyze long source texts in a single pass.
Key Capabilities
- Biomedical Evidence Attribution: Classifies the degree of agreement or contradiction between an assertion and a source article using a five-point scale (-2 to +2).
- Hallucination Detection: Can be used to quantify hallucinations in LLM-generated answers by verifying cited claims against their sources.
- Misattribution Identification: Identifies high-stakes misattributions in documents such as clinical guidelines.
- Cost-Effective: Offers performance comparable to frontier LLMs (e.g., GPT-5, GPT-4o) for its specialized task, but with significantly lower deployment costs due to its smaller size.
- High-Quality Explanations: Provides detailed, step-by-step explanations for its scoring decisions.
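To illustrate how the five-point scores might be used downstream for hallucination detection, here is a minimal Python sketch. It assumes the scores have already been parsed from the model output as integers from -2 to +2. The label wording, the `Attribution` class, the `unsupported_rate` helper, and the `min_supported` threshold are all hypothetical choices made for illustration, not part of the model's documented interface.

```python
from dataclasses import dataclass

# Hypothetical label wording: the model card only states that scores
# run from -2 (contradiction) to +2 (agreement).
SCORE_LABELS = {
    -2: "strongly contradicts",
    -1: "partially contradicts",
    0: "neutral or not addressed",
    1: "partially supports",
    2: "strongly supports",
}


@dataclass
class Attribution:
    """One claim paired with the score assigned against its cited source."""
    claim: str
    score: int  # expected to be an integer in [-2, 2]


def label_for(score: int) -> str:
    """Map a score to a readable label, rejecting out-of-range values."""
    if score not in SCORE_LABELS:
        raise ValueError(f"score must be an integer in [-2, 2], got {score!r}")
    return SCORE_LABELS[score]


def unsupported_rate(items: list[Attribution], min_supported: int = 1) -> float:
    """Fraction of claims scoring below `min_supported`.

    Treating scores below +1 as unsupported is an assumption; pick the
    threshold that matches your own tolerance for weak evidence.
    """
    if not items:
        return 0.0
    for item in items:
        label_for(item.score)  # validates the score range
    unsupported = sum(1 for item in items if item.score < min_supported)
    return unsupported / len(items)


if __name__ == "__main__":
    scored = [
        Attribution("Drug A reduces mortality.", 2),
        Attribution("Drug A has no side effects.", -1),
    ]
    for item in scored:
        print(f"{label_for(item.score)}: {item.claim}")
    print(f"unsupported rate: {unsupported_rate(scored):.2f}")
```

Raising `min_supported` to 2 would count only strong agreement as support, which may suit high-stakes uses such as auditing clinical guidelines.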
Good For
- Researchers and developers needing to automate the verification of biomedical claims against scientific literature.
- Applications requiring scalable hallucination detection in LLM outputs within the biomedical domain.
- Identifying potential misattributions in medical texts and clinical guidelines.
- Anyone seeking an efficient, lightweight alternative to large, expensive LLMs for specialized biomedical fact-checking and evidence analysis.