DiagAgent-7B: RL-Optimized Diagnostic Agent
DiagAgent-7B is a 7.6 billion parameter large language model developed by Henrychur, specifically optimized for interactive, multi-turn diagnostic reasoning in medical applications. Unlike traditional one-shot medical LLMs, DiagAgent-7B is designed to function as an agent, capable of recommending informative examinations, iteratively updating a working diagnosis as new information becomes available, and deciding when to commit to a final diagnosis.
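The recommend, update, commit cycle described above can be sketched as a simple control loop. This is a minimal illustration, not the model's actual interface: the `run_episode` function, the `("exam", name)` / `("diagnose", label)` action encoding, the toy policy, and the toy result table are all hypothetical stand-ins for the real model and the DiagGym environment.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Optional, Tuple

# Hypothetical action encoding: ("exam", exam_name) or ("diagnose", label).
Action = Tuple[str, str]


@dataclass
class Episode:
    # (exam_name, result_text) pairs accumulated over the interaction.
    history: List[Tuple[str, str]] = field(default_factory=list)
    diagnosis: Optional[str] = None


def run_episode(
    policy: Callable[[str, List[Tuple[str, str]]], Action],
    get_exam_result: Callable[[str], str],
    initial_note: str,
    max_turns: int = 8,
) -> Episode:
    """Ask the policy for one action per turn until it commits to a diagnosis."""
    episode = Episode()
    for _ in range(max_turns):
        kind, value = policy(initial_note, episode.history)
        if kind == "diagnose":
            episode.diagnosis = value  # the agent decides to stop here
            break
        # Otherwise the agent requested an examination; record its result.
        episode.history.append((value, get_exam_result(value)))
    return episode


# Toy stand-ins (assumptions for illustration only, not clinical guidance).
TOY_RESULTS = {
    "cbc": "elevated white cell count",
    "ct_abdomen": "inflamed appendix",
}


def toy_policy(note: str, history: List[Tuple[str, str]]) -> Action:
    ordered = {exam for exam, _ in history}
    for exam in ("cbc", "ct_abdomen"):
        if exam not in ordered:
            return ("exam", exam)
    results = dict(history)
    if "inflamed appendix" in results["ct_abdomen"]:
        return ("diagnose", "appendicitis")
    return ("diagnose", "undetermined")


episode = run_episode(
    toy_policy,
    lambda exam: TOY_RESULTS.get(exam, "not available"),
    "right lower quadrant pain",
)
print(episode.history, episode.diagnosis)
```

In a real deployment the policy would be a call to the language model and the result function would be a clinician, an EHR lookup, or a simulator; the `max_turns` cap is one simple way to guarantee the loop terminates even if the agent never commits.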
Key Capabilities & Differentiators
- Reinforcement Learning Optimization: Trained end-to-end using multi-turn Reinforcement Learning (GRPO) within the DiagGym virtual clinical environment, allowing for safe, closed-loop learning.
- Interactive Diagnostic Reasoning: Supports multi-turn interactions, enabling a dynamic diagnostic process that mimics real-world clinical workflows.
- High Context Length: Supports a context length of 131,072 tokens, facilitating comprehensive analysis of patient information over extended interactions.
- Superior Performance in Agentic Tasks: Evaluation results show that DiagAgent-7B and its 14B variant significantly outperform baseline LLMs and other agentic systems on single-turn and end-to-end diagnostic metrics. The 7B model reaches a Hit Ratio of 71.12% and an End-to-End Accuracy of 60.78%.
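For readers unfamiliar with GRPO (Group Relative Policy Optimization), its core idea is to score each sampled rollout relative to the other rollouts drawn for the same prompt, rather than against a learned value function. The sketch below shows only that group-relative normalization step. How DiagAgent defines its rewards or applies this across turns is not stated here, so the reward values, the `group_relative_advantages` name, and the use of the population standard deviation are assumptions; implementations differ on such details.

```python
from statistics import mean, pstdev
from typing import List


def group_relative_advantages(rewards: List[float], eps: float = 1e-8) -> List[float]:
    """Normalize each rollout's reward by the mean and spread of its group.

    `rewards` holds one scalar reward per rollout sampled for the same case.
    `eps` avoids division by zero when every rollout receives the same reward.
    """
    mu = mean(rewards)
    sigma = pstdev(rewards)  # population standard deviation (an assumption)
    return [(r - mu) / (sigma + eps) for r in rewards]


# Hypothetical group of four rollouts: two reached the correct diagnosis.
advantages = group_relative_advantages([1.0, 0.0, 0.0, 1.0])
print(advantages)
```

Rollouts that score above their group's average receive a positive advantage and are reinforced; those below average are discouraged. When all rollouts in a group score the same, every advantage is zero and the group contributes no learning signal.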
Ideal Use Cases
- Medical AI Assistants: Developing AI systems that can conduct complex, multi-turn diagnostic dialogues grounded in patient data.
- Clinical Decision Support: Assisting healthcare professionals by suggesting relevant tests and refining diagnoses based on evolving patient information.
- Medical Education & Training: Simulating diagnostic scenarios for training purposes in a risk-free virtual environment.