unsloth/medgemma-4b-it

  • Status: Warm
  • Visibility: Public
  • Modality: Vision
  • Parameters: 4.3B
  • Precision: BF16
  • Context length: 32768
  • Released: May 20, 2025
  • License: other
  • Source: Hugging Face
Overview

MedGemma 4B Instruction-Tuned Model

MedGemma 4B IT is a 4.3-billion-parameter multimodal model from Google, built on the Gemma 3 architecture and optimized for healthcare AI applications. It pairs a SigLIP image encoder, pre-trained on a wide array of de-identified medical images (chest X-rays, dermatology, ophthalmology, and histopathology), with an LLM component trained on diverse medical text and question-answer pairs.

Key Capabilities

  • Multimodal Medical Comprehension: Processes both medical text and images (normalized to 896x896 resolution) to generate text outputs.
  • Specialized Medical Training: Significantly outperforms base Gemma 3 4B on medical image classification, visual question answering, and text-only medical benchmarks.
  • Report Generation: Demonstrates strong performance in generating chest X-ray reports, and can be fine-tuned to further improve accuracy against task-specific ground-truth reports.
  • Long Context Support: Accepts a context of at least 128K tokens, allowing long medical documents and extended inputs.
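As an illustrative sketch of how the multimodal capabilities above might be exercised, the snippet below prepares a chat-format request combining an image with a text instruction. The 896x896 resize mirrors the model's input normalization; the prompt text and the placeholder image are assumptions, and actual inference (shown commented out) requires downloading the model weights.

```python
from PIL import Image

# MedGemma normalizes images to 896x896; resize a placeholder "radiograph" accordingly.
image = Image.new("L", (2048, 2500))  # stand-in for a real chest X-ray
image = image.resize((896, 896))

# Chat-format request mixing an image part and a text part, in the layout
# used by Transformers' multimodal chat templates.
messages = [
    {
        "role": "user",
        "content": [
            {"type": "image", "image": image},
            {"type": "text", "text": "Describe the key findings in this chest X-ray."},
        ],
    },
]

# Illustrative inference call (requires the model weights; not run here):
# from transformers import pipeline
# pipe = pipeline("image-text-to-text", model="unsloth/medgemma-4b-it")
# out = pipe(text=messages, max_new_tokens=256)
```

The same message structure extends to text-only queries by omitting the image part.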

Good For

  • Developing Healthcare AI Applications: Serves as an efficient starting point for applications requiring medical text and image understanding.
  • Medical Text Generation: Ideal for generating text responses, analyses, or summaries from medical inputs.
  • Visual Question Answering: Excels at answering questions based on medical images across various modalities.
  • Fine-tuning: Designed to be fine-tuned by developers with proprietary data for specific medical tasks, offering strong baseline performance for adaptation.
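For the fine-tuning use case, proprietary data typically has to be converted into a chat format before supervised training. The sketch below shows one plausible conversion of (image, report) pairs into chat-style examples; the field names, prompt wording, and sample data are assumptions for illustration, not a documented MedGemma schema.

```python
def to_chat_example(image_path: str, report_text: str) -> dict:
    """Convert one (image, report) pair into a chat-style training example.

    The message layout mirrors the multimodal chat format used by
    Gemma-family instruction-tuned models; exact keys may need adjusting
    for a given fine-tuning framework (e.g. TRL's SFTTrainer).
    """
    return {
        "messages": [
            {
                "role": "user",
                "content": [
                    {"type": "image", "image": image_path},
                    {"type": "text", "text": "Write a radiology report for this chest X-ray."},
                ],
            },
            {
                "role": "assistant",
                "content": [{"type": "text", "text": report_text}],
            },
        ]
    }

# Hypothetical proprietary dataset: image paths paired with ground-truth reports.
pairs = [
    ("cxr_001.png", "No acute cardiopulmonary abnormality."),
    ("cxr_002.png", "Right lower lobe consolidation consistent with pneumonia."),
]
dataset = [to_chat_example(path, report) for path, report in pairs]
```

Each example pairs the user's image-plus-instruction turn with the ground-truth report as the assistant turn, which is the usual shape for supervised fine-tuning of instruction-tuned models.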