EPFLiGHT/Apertus-70B-MeditronFO
EPFLiGHT/Apertus-70B-MeditronFO is a 70-billion parameter medical specialist LLM developed by EPFLiGHT, fine-tuned from Apertus-70B-Instruct on the Fully Open Meditron Corpus. This model is part of the Fully Open Meditron family, an end-to-end auditable pipeline for clinical LLMs with open weights, data, and training. It excels in medical question-answering benchmarks, establishing a new state of the art among fully open medical LLMs, and is primarily intended for research on medical LLMs and auditing clinical AI systems.
Loading preview...
Apertus-70B-MeditronFO: A Fully Open Medical LLM
Apertus-70B-MeditronFO is a 70-billion parameter medical specialist Large Language Model developed by EPFLiGHT. It is built upon the Apertus-70B-Instruct base model and fine-tuned using the Fully Open Meditron Corpus, a clinician-vetted dataset. This model is a key component of the Fully Open Meditron initiative, which emphasizes an end-to-end auditable pipeline for clinical LLMs, featuring open weights, data, and training methodologies.
Key Capabilities
- Medical Specialization: Achieves strong performance across standard medical benchmarks, including MedMCQA, MedQA, PubMedQA, MedXpertQA, and HealthBench Hard, with an average accuracy increase of 6.53% over its base model.
- Auditable Pipeline: Part of the first fully open and auditable pipeline for clinical LLMs, ensuring transparency in its development from data to training.
- Research-Oriented: Designed to support research in medical LLMs, facilitate auditing of clinical AI systems, and ensure reproducibility of the Fully Open Meditron pipeline.
Good for
- Medical LLM Research: Ideal for researchers exploring advancements in medical AI and language models.
- Clinical AI Auditing: Useful for evaluating and auditing the performance and biases of clinical AI systems.
- Reproducibility Studies: Supports efforts to reproduce and verify results within the Fully Open Meditron framework.
Note: This model is intended for research purposes only and is not validated for clinical deployment, individual patient advice, autonomous decision-making, or any other deployment-adjacent use without independent domain-specific safety evaluation.