Overview
somosnlp/Sam_Diagnostic is a specialized AI model built upon the Gemma-2B-IT architecture, meticulously fine-tuned for medical diagnostic applications. Developed by NickyNicky, this model leverages a carefully curated and cleaned dataset of medical transcriptions, initially in English from Kaggle, which were subsequently translated into Spanish using AI technologies like ChatGPT to broaden its applicability.
Key Capabilities
- Medical Transcription Processing: Efficiently processes and understands medical information from patient case transcriptions.
- Multilingual Support: Overcomes language barriers by translating medical data into Spanish, making it suitable for a wider range of medical contexts.
- Diagnostic Assistance: Provides a detailed summary of patient cases, identifies the most relevant medical specialty, and offers a principal diagnosis.
- High Accuracy: Achieved 80% accuracy on new, unseen medical data and 95% accuracy on its training data, indicating strong learning and adaptation capabilities.
Training and Data
The model's development involved rigorous data cleaning to address missing values, followed by transforming the transcriptions into a ChatML format for efficient training. The base model used for fine-tuning was google/gemma-2b-it.
Potential Use Cases
This model is designed to assist medical professionals by streamlining diagnostic processes, ensuring patients are directed to the correct specialists, and providing quick, accurate insights into patient conditions. It aims to enhance medical assistance through faster diagnoses and increased personalization.