BioMedGPT-LM-7B: A Specialized Biomedical Language Model
BioMedGPT-LM-7B, developed by PharMolix, is the first large generative language model based on Llama2 that is specifically fine-tuned for the biomedical domain. It builds on the Llama2-7B-Chat architecture and was fine-tuned on over 26 billion tokens drawn from millions of biomedical papers in the S2ORC corpus.
Key Capabilities and Features
- Biomedical Specialization: Fine-tuned on a large corpus of biomedical literature, giving it a strong command of domain-specific terminology and knowledge.
- High Performance on QA: Performs on par with, or better than, human experts and larger general-purpose foundation models on several biomedical question-answering (QA) benchmarks.
- Foundation for Multimodal AI: Serves as the generative language model component of BioMedGPT-10B, an open multimodal generative pre-trained transformer that bridges natural language with diverse biomedical data modalities.
Training Details
The model was fine-tuned for 5 epochs with a batch size of 192, a context length of 2,048 tokens, and a learning rate of 2e-5. Training data was selected from S2ORC by keeping papers that carry a PubMed Central (PMC) ID or a PubMed ID, which restricts the corpus to the biomedical literature.
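As a rough sketch of the scale these hyperparameters imply (assuming each of the ~26 billion tokens is seen once per epoch, and ignoring sequence-packing overhead, which the document does not specify):

```python
# Back-of-the-envelope arithmetic for the fine-tuning run described above:
# batch size 192, context length 2048, 5 epochs over ~26B tokens.
tokens_per_step = 192 * 2048          # sequences per batch x tokens per sequence
corpus_tokens = 26_000_000_000        # ~26B tokens in the fine-tuning corpus
epochs = 5

total_tokens = corpus_tokens * epochs # tokens processed across all epochs
steps = total_tokens // tokens_per_step

print(f"tokens per optimizer step: {tokens_per_step:,}")   # 393,216
print(f"total tokens processed:    {total_tokens:,}")      # 130,000,000,000
print(f"approximate steps:         {steps:,}")             # ~330,607
```

That is, each optimizer step consumes roughly 0.4M tokens, so the full run corresponds to on the order of 330K steps.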
Use Cases
BioMedGPT-LM-7B is ideal for applications requiring deep understanding and generation of biomedical text, such as:
- Biomedical question answering systems.
- Information extraction from scientific literature.
- Assisting in research and development within the pharmaceutical and medical fields.
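For the QA use case, a minimal sketch of querying the model with Hugging Face `transformers` is shown below. The repository id `PharMolix/BioMedGPT-LM-7B` and the plain instruction-style prompt format are assumptions, not prescribed by this document; check the model card for the recommended prompt template.

```python
def build_prompt(question: str) -> str:
    """Wrap a biomedical question in a simple instruction-style prompt.
    NOTE: this format is an assumption, not an official template."""
    return f"### Question:\n{question}\n\n### Answer:\n"

def answer(question: str, max_new_tokens: int = 256) -> str:
    """Generate an answer with BioMedGPT-LM-7B (requires transformers + torch).
    The repo id below is assumed; verify it on the Hugging Face Hub."""
    # Deferred import so build_prompt() works without transformers installed.
    from transformers import AutoModelForCausalLM, AutoTokenizer

    model_id = "PharMolix/BioMedGPT-LM-7B"  # assumed Hub repository id
    tokenizer = AutoTokenizer.from_pretrained(model_id)
    model = AutoModelForCausalLM.from_pretrained(model_id, device_map="auto")

    inputs = tokenizer(build_prompt(question), return_tensors="pt").to(model.device)
    output = model.generate(**inputs, max_new_tokens=max_new_tokens)
    return tokenizer.decode(output[0], skip_special_tokens=True)

# Example usage (downloads a ~7B-parameter checkpoint; needs a capable GPU):
# print(answer("What is the mechanism of action of metformin?"))
```

Keeping the model-loading code inside `answer` means the prompt helper can be reused (for instance, for batch preprocessing of a QA dataset) without pulling in the heavyweight dependencies.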
For more technical details, refer to the technical report on "BioMedGPT: Open Multimodal Generative Pre-trained Transformer for BioMedicine".