Overview
Med42-v2: A Clinically-Aligned Llama3-70B Model
Med42-v2-70B is a 70-billion-parameter large language model developed by the M42 Health AI Team and instruction- and preference-tuned specifically for medical applications. Built on the Llama3 architecture, the model aims to expand access to medical knowledge through high-quality generative AI.
Key Capabilities & Performance:
- Superior Medical MCQA Performance: Outperforms GPT-4.0 on most medical multiple-choice question answering (MCQA) tasks.
- State-of-the-Art MedQA: Achieves a zero-shot score of 79.10 on MedQA, surpassing other openly available medical LLMs.
- Top Clinical Elo Rating: Ranks highest on the Clinical Elo Rating Leaderboard with a score of 1764, ahead of both Llama3-70B-Instruct and GPT-4o.
- Instruction-Tuned: Fine-tuned on approximately 1 billion tokens from diverse open-access medical sources, including flashcards, exam questions, and dialogues.
- 8K Context Length: Supports an 8192-token context window for processing medical text.
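As a rough illustration of what the 8192-token window means in practice, the sketch below checks whether a prompt plus a generation budget fits in the window and splits longer text into chunks. It is a minimal sketch under stated assumptions, not part of the model's tooling: `whitespace_tokenize`, `fits_context`, and `chunk_text` are hypothetical helpers, and the whitespace split is only a stand-in for the model's real tokenizer, which will generally produce more tokens per word.

```python
from typing import Callable, List

# Context window stated in the overview above.
CONTEXT_WINDOW = 8192


def whitespace_tokenize(text: str) -> List[str]:
    """Hypothetical stand-in tokenizer: splits on whitespace.

    The real tokenizer will yield different (usually higher) counts,
    so treat results from this function as an optimistic estimate.
    """
    return text.split()


def fits_context(
    prompt: str,
    max_new_tokens: int,
    tokenize: Callable[[str], List[str]] = whitespace_tokenize,
    context_window: int = CONTEXT_WINDOW,
) -> bool:
    """Return True if the prompt plus the generation budget fits the window."""
    return len(tokenize(prompt)) + max_new_tokens <= context_window


def chunk_text(
    text: str,
    max_tokens: int,
    tokenize: Callable[[str], List[str]] = whitespace_tokenize,
) -> List[str]:
    """Split text into consecutive chunks of at most max_tokens tokens each."""
    if max_tokens <= 0:
        raise ValueError("max_tokens must be positive")
    tokens = tokenize(text)
    return [
        " ".join(tokens[i : i + max_tokens])
        for i in range(0, len(tokens), max_tokens)
    ]
```

For a long patient record, one might call `chunk_text(record, max_tokens=6000)` and summarize each chunk separately, leaving headroom for instructions and the generated output. The 6000 figure is an arbitrary example, not a recommendation from the model authors.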
Intended Use Cases:
- Medical question answering
- Patient record summarization
- Aiding medical diagnosis
- General health Q&A
Important Limitations:
- Not for Clinical Use: The model is not yet ready for clinical use; extensive human evaluation and safety testing are required first.
- Potential for Harm: May generate incorrect or harmful information and carries a risk of perpetuating biases from training data.
For more details, refer to the research paper.