DiagAgent-8B: RL-Optimized Diagnostic Agent
DiagAgent-8B is an 8-billion-parameter large language model developed by Henrychur for interactive, multi-turn medical diagnostic reasoning. Unlike traditional one-shot medical LLMs, DiagAgent-8B is optimized with reinforcement learning (GRPO, Group Relative Policy Optimization) inside the DiagGym virtual clinical environment. Because training takes place in simulation, the model can safely learn complex diagnostic workflows: recommending informative examinations, updating its working diagnosis as new information becomes available, and deciding when to commit to a final diagnosis.
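The multi-turn workflow described above can be pictured as a simple loop: the agent either requests an examination or commits to a diagnosis, and the environment returns a result for each requested examination. The sketch below is only a toy illustration of that control flow. `ToyEnvironment`, `ToyAgent`, and `run_episode` are hypothetical names, not the DiagGym or DiagAgent API, and the rule-based agent stands in for the language model; its diagnostic rule is a placeholder with no clinical validity.

```python
from dataclasses import dataclass, field


@dataclass
class ToyEnvironment:
    """Hypothetical stand-in for a virtual clinical environment."""
    results: dict  # maps examination name -> simulated result

    def run_exam(self, exam: str) -> str:
        return self.results.get(exam, "not available")


@dataclass
class ToyAgent:
    """Hypothetical rule-based agent; a real agent would be the language model."""
    plan: list
    findings: dict = field(default_factory=dict)

    def next_action(self):
        # Request the next planned examination that has not been performed yet.
        for exam in self.plan:
            if exam not in self.findings:
                return ("request_exam", exam)
        # Nothing left to request: commit to a final diagnosis.
        return ("final_diagnosis", self.diagnose())

    def observe(self, exam: str, result: str) -> None:
        # Store the new evidence so later decisions can use it.
        self.findings[exam] = result

    def diagnose(self) -> str:
        # Placeholder rule for illustration only, not clinical logic.
        if self.findings.get("troponin") == "elevated":
            return "myocardial infarction"
        return "undetermined"


def run_episode(agent, env, max_turns=10):
    """Alternate agent actions and environment results until a diagnosis is made."""
    transcript = []
    for _ in range(max_turns):
        kind, value = agent.next_action()
        transcript.append((kind, value))
        if kind == "final_diagnosis":
            return value, transcript
        agent.observe(value, env.run_exam(value))
    return None, transcript  # turn budget exhausted without a diagnosis


env = ToyEnvironment({"ecg": "ST elevation", "troponin": "elevated"})
agent = ToyAgent(plan=["ecg", "troponin"])
diagnosis, transcript = run_episode(agent, env)
print(diagnosis, len(transcript))
```

In this toy run the agent requests two examinations and commits on the third turn. A learned agent would replace `next_action` with a model call that decides, at each turn, which examination to request or whether to stop.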
Key Capabilities
- Interactive Diagnostic Reasoning: Engages in multi-turn interactions to gather information and refine diagnoses.
- Examination Recommendation: Suggests the most informative examinations based on patient data and current diagnostic hypotheses.
- Dynamic Diagnosis Updates: Adapts its working diagnosis as new evidence is presented.
- Decision to Finalize Diagnosis: Determines when sufficient information has been collected to make a conclusive diagnosis.
- Reinforcement Learning Optimization: Trained end-to-end using GRPO in a closed-loop virtual environment, so the model explores diagnostic strategies in simulation rather than on real patients.
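For readers unfamiliar with GRPO, its core idea is to score each sampled trajectory relative to the other trajectories sampled for the same case, rather than against a learned value function. The sketch below shows only that group-relative normalization step. The function name, the `eps` term, and the use of the population standard deviation are assumptions made for illustration; this is not the DiagAgent training code.

```python
from statistics import mean, pstdev


def group_relative_advantages(rewards, eps=1e-8):
    """Normalize rewards within one group of rollouts for the same case.

    Each advantage is (reward - group mean) / (group std + eps).
    eps (an assumed stabilizer) avoids division by zero when all rewards match.
    """
    mu = mean(rewards)
    sigma = pstdev(rewards)  # population std; implementations may differ here
    return [(r - mu) / (sigma + eps) for r in rewards]


# Four hypothetical rollouts of the same patient case with scalar rewards.
rewards = [1.0, 0.0, 0.5, 0.5]
advantages = group_relative_advantages(rewards)
print(advantages)
```

Rollouts that beat the group average receive positive advantages and are reinforced; those below it are discouraged. When every rollout in a group earns the same reward, all advantages are zero and the group contributes no learning signal.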
Performance Highlights
DiagAgent-8B demonstrates strong performance on medical diagnostic tasks, particularly in multi-turn scenarios. In end-to-end evaluations, it achieves an F1 score of 43.02 and an accuracy of 53.85, outperforming many larger general-purpose LLMs, such as GPT-4o, Claude-4-sonnet, and Llama3.3, as well as other agentic systems on these specific metrics. Training optimizes three objectives: diagnosis accuracy, examination recommendation F1, and a low number of interaction turns.
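To make the three training objectives concrete, the sketch below computes an examination recommendation F1 as set overlap between recommended and reference examinations, then folds it into a scalar reward together with diagnosis correctness and a turn penalty. Both the set-based F1 definition and the reward formula, including every weight, are assumptions chosen for illustration; the source does not specify how the actual metric or reward is computed.

```python
def exam_f1(recommended, reference):
    """F1 between recommended and reference examinations, treated as sets."""
    rec, ref = set(recommended), set(reference)
    true_positives = len(rec & ref)
    if true_positives == 0:
        return 0.0  # also covers empty inputs
    precision = true_positives / len(rec)
    recall = true_positives / len(ref)
    return 2 * precision * recall / (precision + recall)


def toy_reward(correct, f1, turns, max_turns=10,
               w_acc=1.0, w_f1=0.5, w_turn=0.1):
    """Hypothetical reward: accuracy and F1 terms minus a turn penalty.

    All weights are illustrative assumptions, not values from the source.
    """
    turn_penalty = min(turns, max_turns) / max_turns
    return w_acc * float(correct) + w_f1 * f1 - w_turn * turn_penalty


f1 = exam_f1(["ecg", "troponin", "cbc"], ["ecg", "troponin"])
reward = toy_reward(correct=True, f1=f1, turns=3)
print(round(f1, 2), round(reward, 2))
```

Here one unnecessary examination lowers precision to 2/3 while recall stays at 1, giving an F1 of 0.8. The turn penalty illustrates why an agent trained this way has an incentive to stop once it has enough evidence.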
Good for
- Medical AI Assistants: Building AI systems that can simulate clinical diagnostic processes.
- Clinical Decision Support: Assisting healthcare professionals with complex diagnostic pathways.
- Medical Education & Training: Providing a safe, interactive environment for learning diagnostic reasoning.
- Research in Agentic LLMs: Exploring the application of reinforcement learning for complex, multi-step reasoning tasks in specialized domains.