Overview
MediPhi-Instruct: A Specialized Clinical SLM
MediPhi-Instruct is a 3.8-billion-parameter small language model (SLM) from Microsoft Healthcare & Life Sciences, built on the Phi-3.5-mini-instruct base model. It is specialized for medical and clinical natural language processing (NLP) tasks through a modular training approach.
Key Capabilities
- Domain Expertise: Developed by merging five expert models, each fine-tuned on specific medical corpora including PubMed, Medical Wikipedia, Medical Guidelines, Medical Coding (ICD10CM, ICD10PROC, etc.), and open-source clinical documents.
- Clinical Alignment: Further aligned using the microsoft/mediflow dataset, a synthetic collection of 2.5 million high-quality instructions across 14 medical NLP tasks.
- Performance: Achieves an average score of 43.4% on the CLUE+ benchmark, demonstrating strong performance across various medical NLP tasks, with notable improvements in areas like RRS QA (61.6%) and SDoH (56.7%).
- Safety: Retains the safety capabilities of its base model, Phi-3.5-mini-instruct, and shows improved groundedness, refusing or warning for nearly all harmful clinical and patient queries.
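The Domain Expertise item describes merging five expert models into one. This overview does not say which merge method was used, so the sketch below swaps in the simplest option, a weighted average of parameters, purely as an illustration. The function name merge_experts and the toy dictionaries standing in for model weights are hypothetical and not part of any MediPhi tooling.

```python
from typing import Dict, List, Optional

# A toy "state dict": parameter name -> flat list of values.
# Real checkpoints hold tensors; plain lists keep the sketch dependency-free.
StateDict = Dict[str, List[float]]


def merge_experts(
    experts: List[StateDict],
    weights: Optional[List[float]] = None,
) -> StateDict:
    """Merge expert models by weighted parameter averaging.

    ASSUMPTION: linear averaging is used here for illustration only;
    the source does not specify the actual merge technique.
    """
    if not experts:
        raise ValueError("at least one expert is required")
    if weights is None:
        weights = [1.0] * len(experts)  # uniform merge by default
    if len(weights) != len(experts):
        raise ValueError("one weight per expert is required")

    total = sum(weights)
    merged: StateDict = {}
    for name, reference in experts[0].items():
        # Average each parameter position across all experts.
        merged[name] = [
            sum(w * expert[name][i] for w, expert in zip(weights, experts)) / total
            for i in range(len(reference))
        ]
    return merged
```

Calling merge_experts with five toy experts and no weights gives each expert an equal say in the merged parameters; passing weights lets one corpus (for example, clinical documents) count for more than the others.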
Good For
- Research in Clinical NLP: Ideal for accelerating research in medically adapted language models.
- Resource-Constrained Environments: Suitable for memory/compute constrained and latency-bound scenarios due to its small parameter count.
- Benchmarking: Can be used in benchmarking contexts for clinical NLP tasks.
- Expert User Verification: Intended for use where outputs can be verified by expert users, especially in high-risk scenarios.
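To make the resource-constrained claim concrete, the following back-of-the-envelope sketch estimates how much memory the weights of a 3.8-billion-parameter model occupy at common numeric precisions. It counts weights only; activations, the KV cache, and runtime overhead are excluded, and the helper name weight_memory_gb and the choice of precisions are assumptions made for illustration.

```python
# Parameter count stated in the overview.
NUM_PARAMS = 3.8e9

# Bytes needed to store one parameter at each precision.
BYTES_PER_PARAM = {
    "float32": 4.0,
    "bfloat16": 2.0,
    "int8": 1.0,
    "int4": 0.5,
}


def weight_memory_gb(num_params: float, dtype: str) -> float:
    """Return the approximate weight storage in gigabytes (1 GB = 1e9 bytes)."""
    return num_params * BYTES_PER_PARAM[dtype] / 1e9


if __name__ == "__main__":
    for dtype in BYTES_PER_PARAM:
        print(f"{dtype:>8}: {weight_memory_gb(NUM_PARAMS, dtype):.1f} GB")
```

Under these assumptions the weights take roughly 7.6 GB at 16-bit precision and about 1.9 GB at 4-bit, which is why a model of this size can fit on a single consumer GPU or a well-provisioned laptop.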