microsoft/MediPhi

Warm
Public
4B
BF16
4096
License: mit
Hugging Face
Overview

What is microsoft/MediPhi?

microsoft/MediPhi is a 3.8 billion parameter small language model (SLM) developed by Microsoft Healthcare & Life Sciences, built upon the Phi-3.5-mini-instruct base model. It is specifically designed for medical and clinical natural language processing (NLP) tasks. The model was created through a modular approach, initially fine-tuning five expert models on distinct medical corpora (PubMed, Medical Wikipedia, Medical Guidelines, Medical Coding, and open-source clinical documents). These experts were then merged using the BreadCrumbs technique to form the unified MediPhi model, which retains general abilities while gaining specialized medical knowledge.

Key Capabilities

  • Medical Domain Specialization: Excels in understanding and processing medical and clinical language, adapted from a general-purpose SLM.
  • Modular Architecture: Benefits from a unique merging strategy that combines specialized knowledge from various medical sub-domains.
  • Resource Efficiency: Designed for use in memory/compute-constrained and latency-bound environments, making it suitable for edge or real-time clinical applications.
  • Research Acceleration: Intended to accelerate research in clinical NLP, offering a robust base for benchmarking and further development.
  • Safety & Alignment: Demonstrates conservation of base model safety capabilities, including resistance to jailbreaking and harmfulness, and improved groundedness, as evaluated by Medical Red Teaming Protocols.

Good For

  • Clinical NLP Research: Ideal for researchers working on language models in medical and clinical scenarios.
  • Benchmarking: Suitable for evaluating performance in medical NLP tasks, especially with the extended CLUE+ benchmark.
  • Applications requiring medically adapted language models: Where general LLMs may lack the necessary domain-specific understanding.
  • Environments with limited computational resources: Its small size makes it efficient for deployment where larger models are impractical.

It is important to note that while powerful, MediPhi is intended for research and requires expert user verification of outputs, especially in high-risk scenarios, and adherence to responsible AI practices.