SympQwen-0.5B: Medical Symptom-to-Diagnosis Mapping Model

SympQwen-0.5B is a specialized 0.5 billion parameter language model, fine-tuned from the Qwen2.5-0.5B-Instruct architecture. Its primary function is to map medical symptoms described in patient-like language to plausible diagnoses, leveraging the gretelai/symptom_to_diagnosis dataset for its training. This model is equipped with a 32768 token context length, allowing for comprehensive symptom input.

Key Capabilities

Diagnostic Hypothesis Generation: Generates potential diagnostic hypotheses based on provided symptom descriptions.
Educational Aid: Assists medical students in developing diagnostic reasoning skills.
Research Tool: Serves as a foundational model for research into AI-driven clinical decision support systems.

Limitations and Important Considerations

Dataset Size: Trained on a relatively small dataset (just over 1,000 examples), which may limit its generalization to rare or atypical symptom presentations.
Data Imbalance: Some diagnoses are under-represented, potentially leading to biased outputs towards more common conditions.
Synthetic Data: Training data consists of LLM-generated symptom descriptions, which may lack the full variability and nuance of real-world patient narratives.
Not for Clinical Use: This model is strictly intended for research and educational augmentation. It is not a diagnostic tool and should not replace professional medical evaluation or clinical decision-making.