deep-div/MediLlama-3.2

Hugging Face
TEXT GENERATIONConcurrency Cost:1Model Size:3.2BQuant:BF16Ctx Length:32kPublished:May 16, 2025License:apache-2.0Architecture:Transformer0.0K Open Weights Warm

deep-div/MediLlama-3.2 is a 3.2 billion parameter, 32768-token context length causal language model developed by InferenceLab. Fine-tuned from Meta's LLaMA 3.2 3B Instruct, it is specifically optimized for English-language medical and healthcare applications. This model excels at tasks like medical Q&A, symptom checking, and patient education, serving as a specialized medical chatbot.

Loading preview...

MediLlama-3.2: A Specialized Medical LLM

MediLlama-3.2 is a 3.2 billion parameter instruction-tuned model developed by InferenceLab, based on Meta's LLaMA 3.2 3B Instruct. It has been extensively fine-tuned using Supervised Fine-Tuning (SFT) on diverse medical datasets, including cleaned medical QA pairs, synthetic doctor-patient conversations, and public health forums. This specialization allows it to handle complex English-language healthcare scenarios, such as diagnostic queries, treatment suggestions, and general medical advice, with a 32768-token context window.

Key Capabilities

  • Medical Q&A: Provides informed answers to health-related questions.
  • Symptom Checking: Assists in initial symptom triage and understanding.
  • Patient Education: Generates educational content and explanations for medical conditions.
  • Domain-Adapted: Optimized for healthcare and medical applications, distinguishing it from general-purpose LLMs.

Performance & Training

The model achieved an accuracy of 81.3% on unseen medical QA pairs, with a BLEU score of 34.5 and ROUGE-L of 62.2. Training involved approximately 12 hours on 4 NVIDIA A100 GPUs, utilizing bf16 mixed precision and a learning rate of 1e-5. Protected health information (PHI) was rigorously removed from all training data.

Ideal Use Cases

  • Direct Use: Functions as a medical chatbot or virtual assistant for educational content and initial health inquiries.
  • Downstream Integration: Can be integrated into telehealth systems, clinical documentation tools, or diagnostic assistants after further task-specific fine-tuning.

Important Considerations

MediLlama-3.2 is intended for research and prototyping. It should not be used for real-time diagnosis, treatment decisions, or high-risk emergency response without validation by certified medical professionals. Users must be aware of potential biases, hallucinations, or outdated advice, and outputs should always be cross-referenced with expert medical opinion.