AdaptLLM/medicine-LLM-13B

TEXT GENERATION · Concurrency Cost: 1 · Model Size: 13B · Quant: FP8 · Ctx Length: 4k · Published: Dec 19, 2023 · License: apache-2.0 · Architecture: Transformer · Open Weights

AdaptLLM/medicine-LLM-13B is a 13 billion parameter language model developed by AdaptLLM, based on the LLaMA-1 architecture. It is continually pre-trained on domain-specific biomedical corpora using a novel reading comprehension method to enhance domain knowledge while preserving question-answering abilities. This model is specifically optimized for tasks within the biomedicine domain, demonstrating strong performance comparable to much larger domain-specific models.

AdaptLLM/medicine-LLM-13B: Domain Adaptation for Biomedicine

AdaptLLM/medicine-LLM-13B is a 13 billion parameter language model derived from LLaMA-1-13B, developed by AdaptLLM. It is the product of research presented at ICLR 2024 on adapting large language models to specific domains via continual pre-training.

Key Capabilities & Innovations

  • Domain-Specific Knowledge: Enriched with extensive biomedical knowledge through continued pre-training on relevant corpora.
  • Reading Comprehension Method: Transforms the large-scale pre-training corpus into reading-comprehension-style texts, appending self-supervised tasks to each raw passage (see the sketch after this list); this enriches domain knowledge without degrading general question-answering performance.
  • Performance: The underlying AdaptLLM method has shown that even 7B parameter models can compete with significantly larger domain-specific models like BloombergGPT-50B, indicating strong efficiency and effectiveness.
  • Scalability: The method remains effective at larger scales; this 13B version consistently improves over its general-purpose base model on domain tasks.
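
To make the transformation concrete, here is a minimal, hypothetical sketch of the idea, not AdaptLLM's actual pipeline (the paper mines a richer mix of tasks, such as summarization and word-to-text, directly from each passage): the raw text is kept first so domain facts are still learned verbatim, and question-answer tasks whose answers come from the passage itself are appended. The `make_reading_comprehension` helper and its two toy tasks are assumptions for illustration.

```python
def make_reading_comprehension(passage: str) -> str:
    """Turn a raw domain passage into reading-comprehension-style text.

    Hypothetical sketch of the AdaptLLM idea: keep the raw text (domain
    knowledge), then append instruction-style tasks whose answers are
    derived from the passage itself, so no human labels are needed.
    """
    sentences = [s.strip() for s in passage.split(".") if s.strip()]
    tasks = [
        # Toy self-supervised tasks; the real method mines summarization,
        # word-to-text, and similar tasks with pattern matching.
        f"Question: Repeat the first sentence of the passage.\nAnswer: {sentences[0]}.",
        f"Question: How many sentences does the passage contain?\nAnswer: {len(sentences)}.",
    ]
    return "\n\n".join([passage] + tasks)


if __name__ == "__main__":
    raw = ("Aspirin irreversibly inhibits cyclooxygenase. "
           "This reduces prostaglandin synthesis and platelet aggregation.")
    print(make_reading_comprehension(raw))
```

The transformed strings are then used as ordinary continual pre-training data; the instruction-style formatting is what preserves the model's prompting ability while it absorbs domain text.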

Use Cases & Evaluation

This model is particularly well-suited for applications requiring deep understanding and generation within the biomedicine domain. AdaptLLM provides pre-templatized testing splits and evaluation scripts to facilitate benchmarking on domain-specific tasks. Developers can use this model for tasks such as medical question answering, information extraction from biomedical texts, and other domain-specific natural language processing challenges.
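
For quick local experimentation, the checkpoint can be loaded with the Hugging Face `transformers` library. The snippet below is a minimal sketch, assuming the weights are hosted under the `AdaptLLM/medicine-LLM-13B` repo id and that a GPU with roughly 28 GB of free memory is available for fp16 inference; since this is a base (completion) model rather than a chat model, the prompt is phrased as text to continue.

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

MODEL_ID = "AdaptLLM/medicine-LLM-13B"  # assumed Hugging Face repo id

tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
model = AutoModelForCausalLM.from_pretrained(
    MODEL_ID,
    torch_dtype=torch.float16,  # ~26 GB of weights for a 13B model
    device_map="auto",          # place layers on the available GPU(s)
)

# Completion-style prompt for a medical QA probe.
prompt = "Question: Which enzyme does aspirin irreversibly inhibit?\nAnswer:"
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=64, do_sample=False)

# Print only the newly generated tokens.
new_tokens = outputs[0][inputs["input_ids"].shape[1]:]
print(tokenizer.decode(new_tokens, skip_special_tokens=True))
```

Greedy decoding (`do_sample=False`) keeps the probe deterministic; for longer generations, sampling parameters such as `temperature` and `top_p` can be passed to `generate`.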