Jarvis1111/DoctorAgent-RL-SFT-1k-Thinking

TEXT GENERATIONConcurrency Cost:1Model Size:7.6BQuant:FP8Ctx Length:32kPublished:Jun 16, 2025License:apache-2.0Architecture:Transformer0.0K Open Weights Cold

Jarvis1111/DoctorAgent-RL-SFT-1k-Thinking is a 7.6 billion parameter model developed by JarvisUSTC, built upon the Qwen2.5-7B-Instruct base. This model employs a novel reinforcement learning (RL)-based multi-agent collaborative framework to optimize multi-turn clinical dialogues. It excels at dynamic diagnostic reasoning and questioning strategies, aiming to improve diagnostic accuracy and resource allocation in medical consultations. The model is specifically fine-tuned for medical consultation scenarios, addressing challenges like vague diagnoses and static dialogue models.

Loading preview...

DoctorAgent-RL: Multi-Agent Clinical Dialogue System

DoctorAgent-RL is a 7.6 billion parameter model developed by JarvisUSTC, based on the Qwen2.5-7B-Instruct architecture. It introduces a novel reinforcement learning (RL)-based multi-agent collaborative framework designed to model medical consultations as dynamic decision-making processes under uncertainty. This system features a Doctor Agent that continuously optimizes its questioning strategy through multi-turn interactions with a Patient Agent, guided by comprehensive rewards from a Consultation Evaluator.

Key Capabilities

  • Dynamic Strategy Optimization: Utilizes reinforcement learning for continuous policy updates, enabling adaptive dialogue behavior and information gathering aligned with clinical reasoning.
  • Multi-Agent Collaboration: Employs distinct Doctor and Patient agents, each with specific roles, to simulate realistic medical consultations.
  • Comprehensive Reward Design: Guides optimal strategies using multi-dimensional evaluation metrics for consultation quality.
  • Enhanced Diagnostic Performance: Experiments demonstrate superior multi-turn reasoning and final diagnostic accuracy compared to existing models.
  • MTMedDialog Dataset: Introduces the first English multi-turn medical consultation dataset, facilitating patient interaction simulations.

Good For

  • Improving Diagnostic Accuracy: Reduces misdiagnosis risks by enabling more thorough and adaptive information gathering.
  • Optimizing Medical Resource Allocation: Streamlines consultation processes through efficient and clinically aligned dialogue strategies.
  • Simulating Clinical Scenarios: Provides a robust framework for training and evaluating AI in complex medical dialogue environments.