Edifon/SOAP_SFT_V1

Vision · Model Size: 4.3B · Quant: BF16 · Ctx Length: 32k · Published: Mar 14, 2026 · License: apache-2.0 · Architecture: Transformer · Open Weights

Edifon's SOAP_SFT_V1 is a 4.3 billion parameter causal language model, fine-tuned from Gemma 3 4B Instruct, specifically designed to generate structured medical SOAP notes (Subjective, Objective, Assessment, Plan) from doctor-patient dialogues. Optimized for clinical NLP research and healthcare professionals, it excels at converting consultation transcripts into the standardized SOAP format. The model was trained using Supervised Fine-Tuning with LoRA on a dataset of 9,250 medical dialogues, achieving stable convergence over 5 epochs.


Edifon/SOAP_SFT_V1: Medical SOAP Note Generator

SOAP_SFT_V1 is a 4.3 billion parameter language model developed by Edifon, fine-tuned from unsloth/gemma-3-4b-it-unsloth-bnb-4bit. Its primary function is to automatically generate structured clinical SOAP notes (Subjective, Objective, Assessment, Plan) from doctor-patient dialogues, making it a specialized tool for clinical NLP.
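Because the model's output is a structured note, a small post-processing step can split the generated text into its four sections. The sketch below assumes each section begins with its name followed by a colon (e.g. `Subjective:`); the exact heading format the model emits is an assumption, not something the card specifies.

```python
import re

# Assumed output layout: each SOAP section starts on its own line with its
# name and a colon, e.g. "Subjective: ...". Adjust the pattern if the model
# uses a different heading style.
def parse_soap_note(text: str) -> dict:
    """Split a generated SOAP note into a {section: content} dict."""
    pattern = r"(?im)^(Subjective|Objective|Assessment|Plan)\s*:\s*"
    parts = re.split(pattern, text)
    # re.split with a capture group yields [preamble, heading, body, heading, body, ...]
    return {
        heading.capitalize(): body.strip()
        for heading, body in zip(parts[1::2], parts[2::2])
    }

# Hypothetical generated note for illustration.
example = """Subjective: Patient reports a sore throat for three days.
Objective: Temp 38.1 C, erythematous pharynx, no exudate.
Assessment: Likely viral pharyngitis.
Plan: Supportive care; follow up if symptoms persist."""

parsed = parse_soap_note(example)
```

A parser like this also makes it easy to validate that all four sections are present before storing a note.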

Key Capabilities

  • Structured Medical Note Generation: Converts free-form clinical consultation transcripts into the standardized S, O, A, P format.
  • Fine-tuned Performance: Trained with Supervised Fine-Tuning (SFT) using LoRA adapters applied to the language-model layers for medical text generation.
  • Efficient Training: Trained on a single H100 GPU using Unsloth and Hugging Face's TRL library, with Unsloth reportedly providing roughly 2x faster fine-tuning than a standard TRL setup.
  • Robust Training Data: Fine-tuned on the syafiqassegaf/soap-dataset comprising 9,250 examples of medical dialogues and corresponding SOAP notes.
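The SFT-with-LoRA recipe described above could be configured roughly as follows with `peft` and TRL. The target modules, rank, learning rate, and batch sizes here are illustrative assumptions; only the 5 epochs and the 9,250-example dataset come from the card.

```python
from peft import LoraConfig
from trl import SFTConfig

# Illustrative LoRA setup targeting the language-model projection layers,
# as is common for Gemma-family fine-tunes (these values are assumptions,
# not published with the model).
lora_config = LoraConfig(
    r=16,
    lora_alpha=32,
    lora_dropout=0.05,
    bias="none",
    task_type="CAUSAL_LM",
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj",
                    "gate_proj", "up_proj", "down_proj"],
)

# SFT settings matching the card's description: 5 epochs over the
# 9,250-example dialogue/SOAP dataset; other values are assumptions.
sft_config = SFTConfig(
    num_train_epochs=5,
    per_device_train_batch_size=2,
    gradient_accumulation_steps=8,
    learning_rate=2e-4,
    bf16=True,
    output_dir="soap_sft_v1",
)
```

These objects would then be passed to TRL's `SFTTrainer` along with the base model and dataset.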

Intended Use & Limitations

This model is designed to assist healthcare professionals and clinical NLP researchers by streamlining the creation of SOAP notes. It is not a substitute for professional medical judgment, and its outputs should be reviewed by a qualified clinician before any clinical use. Limitations include training exclusively on English-language dialogues and likely performance degradation on subspecialty consultations under-represented in the training data. Training converged stably over 5 epochs with a steady reduction in loss, indicating that the model learned the SOAP output format; stable convergence does not by itself guarantee clinical accuracy.