unidocs/llama-3.1-8b-komedic-instruct
unidocs/llama-3.1-8b-komedic-instruct is an 8-billion-parameter, LLaMA 3.1-based instruction-tuned model developed by Unidocs and fine-tuned specifically for healthcare-related queries. It leverages a proprietary healthcare dataset, including AIHub's super-large AI healthcare Q&A data and medical/legal professional book corpora, to strengthen its performance in medical contexts. The model is designed to assist with healthcare tasks and achieves an average MMLU accuracy of 0.72 across medical categories.
Overview
unidocs/llama-3.1-8b-komedic-instruct was released on October 16, 2024 as part of the AIDC-HPC project and is used in Unidocs' ezMyAIDoctor. It was produced by full fine-tuning (continued pretraining) of the meta-llama/Llama-3.1-8B-Instruct base on a specialized healthcare dataset.
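Because the model is a full fine-tune of meta-llama/Llama-3.1-8B-Instruct, it should load through the standard transformers causal-LM interface. The following is a minimal sketch, assuming the checkpoint is hosted under the repository id above; the dtype and device settings are illustrative choices, not documented requirements.

```python
# Minimal loading sketch; assumes the standard transformers causal-LM
# interface inherited from the Llama-3.1-8B-Instruct base.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "unidocs/llama-3.1-8b-komedic-instruct"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,  # ~16 GB of weights for 8B parameters
    device_map="auto",           # requires accelerate to be installed
)
```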
Key Capabilities
- Healthcare Specialization: Fine-tuned with extensive healthcare data, including AIHub's super-large AI healthcare question-answer data, an improved Korean performance corpus, and medical/legal professional book corpora.
- Instruction Following: Designed to respond effectively to healthcare-related queries and tasks (see the inference sketch after this list).
- Performance: Achieves an average MMLU accuracy of 0.72 across various medical categories, including anatomy (0.68), clinical knowledge (0.75), college medicine (0.68), medical genetics (0.70), and professional medicine (0.76).
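Since the model is instruction-tuned, chat-style prompting through the tokenizer's chat template is the natural interface. The sketch below continues from the loading example above; the system prompt, sample question, and sampling parameters are illustrative assumptions, not documented defaults.

```python
# Chat-style inference sketch (continues from the loading example above).
# The system prompt and the sample question are illustrative assumptions.
messages = [
    {"role": "system", "content": "You are a helpful healthcare assistant."},
    {"role": "user", "content": "What are common early symptoms of type 2 diabetes?"},
]

input_ids = tokenizer.apply_chat_template(
    messages,
    add_generation_prompt=True,
    return_tensors="pt",
).to(model.device)

output_ids = model.generate(
    input_ids,
    max_new_tokens=256,
    do_sample=True,
    temperature=0.6,
    top_p=0.9,
)
# Decode only the newly generated tokens, skipping the prompt.
print(tokenizer.decode(output_ids[0][input_ids.shape[-1]:], skip_special_tokens=True))
```

As with any output from this model, generated answers should be checked against the limitations described below.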
Intended Uses & Limitations
This model is intended to assist with healthcare-related queries and tasks. However, it is crucial to understand its limitations:
- Not a Medical Professional: It should not be used as a substitute for professional medical advice, diagnosis, or treatment. Always consult qualified healthcare providers for medical concerns.
- Potential for Bias/Inaccuracy: The model may produce biased or inaccurate results and should not be solely relied upon for critical healthcare decisions.
- Data-Limited Knowledge: Its knowledge is restricted to its training data and cut-off date, and it may exhibit biases present in that data.
Training Data
The model was fine-tuned on a proprietary healthcare dataset, which includes:
- Wiki and kowiki data.
- AIHub's "super-large AI healthcare question-answer data."
- AIHub's "super-large AI corpus with improved Korean performance."
- AIHub's "medical and legal professional book corpus."