Name: MohamedAhmedAE/distil_med42_8B_Llama-3.2-1B-Instruct API
Brand: Featherless.ai
Price: 10.00 USD
Availability: InStock
Author: MohamedAhmedAE

Overview

MohamedAhmedAE/distil_med42_8B_Llama-3.2-1B-Instruct is a 1.24 billion parameter instruction-tuned medical chat model, developed by Mohamed Abo El-Enen et al. and published in the 2025 IEEE ICICIS conference. It is a distilled version of the 8.03 billion parameter Llama3-Med42-8B teacher model, built upon a Llama-3.2-1B-Instruct student base. This model is designed as a compact medical assistant, maintaining a usable chat template for conversational prompting.

Key Capabilities & Distillation Method

Knowledge Distillation: Utilizes temperature-scaled KL-divergence, specialty-weighted losses, and attention-map alignment to transfer medical expertise from a larger teacher model.
Efficiency: Achieves 89.3% of the teacher's token-level accuracy while reducing parameters by 75%, resulting in a lightweight model suitable for resource-constrained environments.
Medical Specialization: Trained on a unified corpus of 18 medical benchmarks (1.64M samples), including MMLU medical subtasks, PubMedQA, and clinical dialogues.
Performance: Reaches 47.7% average accuracy on MMLU-Medical, representing a 20.5% relative improvement over the base LLaMA 3.2-1B, with an inference speed of 59.5 tokens/sec.

Intended Use & Limitations

Use Cases: Ideal for research in efficient medical chat assistants, knowledge-distillation studies, and edge/low-resource deployment experiments.
Critical Limitation: The model is not a certified clinical tool. Expert review found critical factual errors in approximately 21% of sampled answers, necessitating qualified human oversight for any output. It should be treated as a research checkpoint, not a production-ready diagnostic tool.

Overview

Overview

Key Capabilities & Distillation Method

Intended Use & Limitations

Full Model Card (README)