BioMedGPT-LM-7B Overview
BioMedGPT-LM-7B, developed by PharMolix, is the first large generative language model for the biomedical domain built on Llama2. This 7-billion-parameter model was fine-tuned from Llama2-7B-Chat on more than 26 billion tokens drawn from millions of biomedical papers in the S2ORC corpus.
Key Capabilities
- Biomedical Expertise: Specialized in understanding and generating content relevant to biomedicine.
- High Performance: Matches or exceeds human-level performance and much larger general-purpose models on several biomedical QA benchmarks.
- Foundation for Multimodal AI: Serves as the language model component for BioMedGPT-10B, an open multimodal generative pre-trained transformer for biomedicine.
Training Details
The model was fine-tuned for 5 epochs with a batch size of 192, a context length of 2048 tokens, and a learning rate of 2e-5. The training data was selected from the S2ORC corpus by filtering on PubMed Central (PMC) IDs and PubMed IDs to ensure biomedical relevance.
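As a rough sanity check on these hyperparameters, the figures above imply the following per-step and per-epoch token budgets. This is a back-of-the-envelope sketch: it assumes every sequence is packed to the full context length and treats "over 26 billion tokens" as exactly 26 billion, so the real step counts will differ.

```python
# Illustrative arithmetic from the stated hyperparameters; variable
# names are our own, not from the BioMedGPT release.
BATCH_SIZE = 192                 # sequences per optimization step
CONTEXT_LENGTH = 2048            # tokens per sequence
EPOCHS = 5
CORPUS_TOKENS = 26_000_000_000   # "over 26 billion tokens" (lower bound)

# Tokens consumed by one optimization step, assuming full packing.
tokens_per_step = BATCH_SIZE * CONTEXT_LENGTH

# Approximate optimization steps per pass over the corpus, and in total.
steps_per_epoch = CORPUS_TOKENS // tokens_per_step
total_steps = steps_per_epoch * EPOCHS

print(f"tokens/step:     {tokens_per_step:,}")    # 393,216
print(f"steps/epoch:     {steps_per_epoch:,}")
print(f"total steps (~): {total_steps:,}")
```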
Good For
- Biomedical question answering.
- Research and development in biomedical natural language processing.
- Applications requiring deep understanding of biomedical literature.
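Because the model is fine-tuned from Llama2-7B-Chat, prompts for these use cases should follow the Llama-2-Chat instruction template. Below is a minimal sketch of a prompt builder; the helper name and the system message are illustrative assumptions, not part of the official release.

```python
def build_prompt(
    user_msg: str,
    system_msg: str = "You are a helpful biomedical assistant.",
) -> str:
    """Wrap a user question in the Llama-2-Chat [INST]/<<SYS>> template
    inherited from the Llama2-7B-Chat base model."""
    return f"[INST] <<SYS>>\n{system_msg}\n<</SYS>>\n\n{user_msg} [/INST]"

# Example: a biomedical QA prompt ready to be tokenized and passed
# to the model for generation.
prompt = build_prompt("What is the mechanism of action of metformin?")
print(prompt)
```

The tokenizer typically prepends the `<s>` beginning-of-sequence token itself, so the template here starts directly at `[INST]`.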