Jarvis1111/DoctorAgent-RL-SFT-1k-Thinking
TEXT GENERATIONConcurrency Cost:1Model Size:7.6BQuant:FP8Ctx Length:32kPublished:Jun 16, 2025License:apache-2.0Architecture:Transformer0.0K Open Weights Cold

Jarvis1111/DoctorAgent-RL-SFT-1k-Thinking is a 7.6 billion parameter model developed by JarvisUSTC, built upon the Qwen2.5-7B-Instruct base. This model employs a novel reinforcement learning (RL)-based multi-agent collaborative framework to optimize multi-turn clinical dialogues. It excels at dynamic diagnostic reasoning and questioning strategies, aiming to improve diagnostic accuracy and resource allocation in medical consultations. The model is specifically fine-tuned for medical consultation scenarios, addressing challenges like vague diagnoses and static dialogue models.

Loading preview...