BioMistral-7B-Synthetic-EHR Overview
This model, developed by abhishek-ch, is a 7-billion-parameter language model based on BioMistral/BioMistral-7B-DARE and converted to the MLX format. It was fine-tuned with LoRA on two datasets: health_facts and a synthetic EHR dataset inspired by MIMIC-IV. This targeted training, run for 1,000 steps (approximately 1 million tokens), improves its performance on specialized healthcare natural language processing tasks.
Key Capabilities
- Clinical Note Analysis: Generates diagnosis summaries from clinical notes, including chief complaints, patient summaries, and admission details, drawing inspiration from the MIMIC-IV-Note dataset.
- Public Health Fact-Checking: Functions as a Public Health AI Assistant capable of fact-checking public health claims, categorizing answers as true, false, unproven, or mixture, and providing reasons for the assessment.
- MLX and Transformers Compatibility: Supports loading and text generation with both the mlx-lm and transformers libraries, offering flexibility for developers.
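As a rough sketch of the mlx-lm path described above: the snippet below wraps a public-health claim in an instruction prompt (using the true/false/unproven/mixture label set from this card) and generates an answer. The repository id and the exact prompt wording are assumptions for illustration, not confirmed by this card; `load` and `generate` are the standard mlx-lm entry points.

```python
def build_fact_check_prompt(claim: str) -> str:
    """Wrap a public-health claim in a fact-checking instruction.

    The label set (true, false, unproven, mixture) comes from the model
    card; the surrounding template wording is an assumption.
    """
    return (
        "You are a Public Health AI Assistant. Fact-check the claim below. "
        "Answer with one of: true, false, unproven, mixture, then give "
        "your reasoning.\n"
        f"Claim: {claim}\n"
        "Answer:"
    )

if __name__ == "__main__":
    # Deferred import so the prompt helper works without MLX installed.
    from mlx_lm import load, generate

    # Hypothetical repo id, inferred from the model name in this card.
    model, tokenizer = load("abhishek-ch/BioMistral-7B-Synthetic-EHR")
    prompt = build_fact_check_prompt("Vitamin C cures the common cold.")
    print(generate(model, tokenizer, prompt=prompt, max_tokens=200))
```

The same prompt string can be passed to a `transformers` text-generation pipeline instead, since the card states both libraries are supported.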
Good for
- Developing AI assistants for clinical decision support or medical information retrieval.
- Applications requiring automated summarization of electronic health records (EHR).
- Building tools for verifying the accuracy of public health information and claims.
- Research and development in biomedical natural language processing, particularly with synthetic EHR data.