starmpcc/Asclepius-Llama3-8B
Asclepius-Llama3-8B by starmpcc is an 8 billion parameter clinical large language model, fine-tuned from Llama-3 with an extended context length of 8192 tokens. The model specializes in processing clinical notes and performs a range of clinical NLP tasks such as Named Entity Recognition, summarization, and question answering. It is an enhanced version of Asclepius-7B, optimized specifically for clinical NLP research.
Asclepius-Llama3-8B: A Clinical LLM
Asclepius-Llama3-8B, developed by starmpcc, is an 8 billion parameter clinical large language model built upon the Llama-3 architecture. It is an enhanced iteration of Asclepius-7B, featuring an extended maximum sequence length of 8192 tokens. The model was initially trained using causal language modeling on synthetic clinical notes and subsequently fine-tuned with clinical instruction-response pairs.
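Since the model is published under the `starmpcc/Asclepius-Llama3-8B` identifier, it can be loaded with the Hugging Face `transformers` library. The sketch below is a minimal, hedged example: the dtype and device settings are illustrative defaults, and the `transformers` import is kept lazy so the helper can be defined without the library installed.

```python
# Minimal loading sketch for Asclepius-Llama3-8B (assumes the `transformers`
# and `torch` packages are installed; settings here are illustrative).
MODEL_ID = "starmpcc/Asclepius-Llama3-8B"
MAX_CONTEXT = 8192  # extended maximum sequence length stated in the model card


def load_model(model_id: str = MODEL_ID):
    """Return (tokenizer, model) for the given checkpoint.

    The import is deferred so this module can be imported without
    `transformers` present; loading downloads ~16 GB of weights.
    """
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(model_id)
    model = AutoModelForCausalLM.from_pretrained(
        model_id,
        torch_dtype="auto",   # use the checkpoint's native precision
        device_map="auto",    # place layers on available GPU(s)/CPU
    )
    return tokenizer, model


if __name__ == "__main__":
    tokenizer, model = load_model()
```

An 8B model in 16-bit precision needs roughly 16 GB of GPU memory for inference; `device_map="auto"` lets `accelerate` spill layers to CPU when that is not available.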
Key Capabilities
This model is designed to perform a range of clinical NLP tasks using clinical notes, including:
- Named Entity Recognition
- Abbreviation Expansion
- Relation Extraction
- Temporal Information Extraction
- Coreference Resolution
- Paraphrasing
- Summarization
- Question Answering
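Each of these tasks is driven by pairing a clinical note with a task instruction in the prompt. The helper below illustrates that pattern; the exact template is documented in the model repository, so the layout here is an assumption for illustration, not the official Asclepius prompt.

```python
# Illustrative prompt builder for note-plus-instruction tasks.
# The section labels and wording are an assumption, not the official template.
def build_prompt(note: str, question: str) -> str:
    """Combine a clinical note and a task instruction into a single prompt."""
    return (
        "You are a clinical language model. "
        "Read the clinical note below and answer the question.\n\n"
        f"[Clinical note]\n{note}\n\n"
        f"[Question]\n{question}\n\n"
        "[Answer]\n"
    )
```

For example, `build_prompt(discharge_summary, "Summarize the hospital course.")` would frame the summarization task, while swapping in "List all medications mentioned." would frame an entity-extraction task over the same note.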
Training Details
Training involved causal-language-model pre-training for approximately 3 hours and instruction fine-tuning for over 30 hours, both on 4x A100 80G GPUs, with configurations similar to those of Stanford Alpaca. A variant, Asclepius-R, trained on MIMIC-III discharge summaries, is also available.
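To make the two-stage recipe concrete, an instruction-tuning record pairs a clinical note with an instruction and a target response. The field names below are hypothetical (the actual dataset schema is not specified here); they only illustrate the shape of a clinical instruction-response pair.

```python
# Hypothetical shape of one instruction fine-tuning record.
# Field names are an assumption for illustration, not the dataset's schema.
record = {
    "note": "Synthetic discharge summary text ...",
    "instruction": "Summarize the hospital course in two sentences.",
    "response": "The patient was admitted for ... and discharged in stable condition.",
}
```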
Intended Use
Asclepius-Llama3-8B is intended solely for research in clinical NLP. Its specialized training on clinical notes makes it suitable for research on understanding and generating text in the medical domain; it is not validated for clinical decision-making or patient care.