Overview
This model, unidocs/llama-3.1-8b-komedic-instruct, is an 8-billion-parameter, LLaMA 3.1-based, instruction-tuned model developed by Unidocs. It was released on October 16, 2024, as part of the AIDC-HPC project and is used in Unidocs' ezMyAIDoctor service. The model was built by fully fine-tuning the meta-llama/Llama-3.1-8B-Instruct base on a specialized healthcare dataset.
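Since the model is fine-tuned from meta-llama/Llama-3.1-8B-Instruct, it presumably expects the standard Llama 3.1 chat prompt format. The sketch below assembles a single-turn prompt by hand; this is an assumption based on the base model's conventions, not documented behavior of this checkpoint, and the system/user strings are illustrative only.

```python
# Sketch: single-turn prompt in the Llama 3.1 chat format
# (assumed to be inherited from the meta-llama/Llama-3.1-8B-Instruct base).

def build_llama31_prompt(system: str, user: str) -> str:
    """Assemble a single-turn Llama 3.1 chat prompt string."""
    return (
        "<|begin_of_text|>"
        "<|start_header_id|>system<|end_header_id|>\n\n"
        f"{system}<|eot_id|>"
        "<|start_header_id|>user<|end_header_id|>\n\n"
        f"{user}<|eot_id|>"
        # The generation continues from the assistant header.
        "<|start_header_id|>assistant<|end_header_id|>\n\n"
    )

prompt = build_llama31_prompt(
    "You are a helpful healthcare assistant.",
    "What are common symptoms of influenza?",
)
```

In practice, `tokenizer.apply_chat_template` from the transformers library produces this format automatically when the tokenizer ships a chat template.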
Key Capabilities
- Healthcare Specialization: Fine-tuned with extensive healthcare data, including AIHub's super-large AI healthcare question-answer data, an improved Korean performance corpus, and medical/legal professional book corpora.
- Instruction Following: Instruction-tuned to respond effectively to healthcare-related queries and tasks.
- Performance: Achieves an average MMLU accuracy of 0.72 across various medical categories, including anatomy (0.68), clinical knowledge (0.75), college medicine (0.68), medical genetics (0.70), and professional medicine (0.76).
Intended Uses & Limitations
This model is intended to assist with healthcare-related queries and tasks. However, it is crucial to understand its limitations:
- Not a Medical Professional: It should not be used as a substitute for professional medical advice, diagnosis, or treatment. Always consult qualified healthcare providers for medical concerns.
- Potential for Bias/Inaccuracy: The model may produce biased or inaccurate results and should not be solely relied upon for critical healthcare decisions.
- Data-Limited Knowledge: Its knowledge is restricted to its training data and cut-off date, and it may exhibit biases present in that data.
Training Data
The model was fine-tuned on a specialized healthcare dataset, which includes:
- Wiki and kowiki data.
- AIHub's "super-large AI healthcare question-answer data."
- AIHub's "super-large AI corpus with improved Korean performance."
- AIHub's "medical and legal professional book corpus."