MediLlama-3.2: A Specialized Medical LLM

MediLlama-3.2 is a 3.2 billion parameter instruction-tuned model developed by InferenceLab, based on Meta's LLaMA 3.2 3B Instruct. It has been extensively fine-tuned using Supervised Fine-Tuning (SFT) on diverse medical datasets, including cleaned medical QA pairs, synthetic doctor-patient conversations, and public health forums. This specialization allows it to handle complex English-language healthcare scenarios, such as diagnostic queries, treatment suggestions, and general medical advice, with a 32768-token context window.

Key Capabilities

Medical Q&A: Provides informed answers to health-related questions.
Symptom Checking: Assists in initial symptom triage and understanding.
Patient Education: Generates educational content and explanations for medical conditions.
Domain-Adapted: Optimized for healthcare and medical applications, distinguishing it from general-purpose LLMs.

Performance & Training

The model achieved an accuracy of 81.3% on unseen medical QA pairs, with a BLEU score of 34.5 and ROUGE-L of 62.2. Training involved approximately 12 hours on 4 NVIDIA A100 GPUs, utilizing bf16 mixed precision and a learning rate of 1e-5. Protected health information (PHI) was rigorously removed from all training data.

Ideal Use Cases

Direct Use: Functions as a medical chatbot or virtual assistant for educational content and initial health inquiries.
Downstream Integration: Can be integrated into telehealth systems, clinical documentation tools, or diagnostic assistants after further task-specific fine-tuning.

Important Considerations

MediLlama-3.2 is intended for research and prototyping. It should not be used for real-time diagnosis, treatment decisions, or high-risk emergency response without validation by certified medical professionals. Users must be aware of potential biases, hallucinations, or outdated advice, and outputs should always be cross-referenced with expert medical opinion.