II-Medical-7B-Preview: A Specialized Medical Reasoning Model
II-Medical-7B-Preview is a 7.6-billion-parameter language model developed by Intelligent Internet, engineered specifically for advanced medical reasoning. It is built on the Qwen/Qwen2.5-7B-Instruct architecture and has undergone extensive fine-tuning, including Supervised Fine-Tuning (SFT) and DAPO (Decoupled Clip and Dynamic Sampling Policy Optimization) on a curated dataset of medical knowledge.
Key Capabilities & Features
- Medical Reasoning: Optimized for complex medical question answering and reasoning tasks across various benchmarks.
- Comprehensive Training Data: Trained on a diverse dataset of 555,000 samples, including public medical reasoning datasets, synthetic medical QA data generated with QwQ, and curated medical R1 traces.
- Robust Evaluation: Evaluated across ten medical QA benchmark tasks, including MedMCQA, MedQA, PubMedQA, MMLU-Pro (medical subset), GPQA, Lancet QA, NEJM QA, MedBullets (4- and 5-option variants), and MedXpertQA.
- Performance: Achieves an average score of 66.4 across these benchmarks, outperforming its base model (Qwen2.5-7B-IT) and several other medical models in its class.
- Data Decontamination: Utilizes a two-step decontamination process (10-grams and fuzzy decontamination) to ensure evaluation integrity.
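The 10-gram step of the decontamination process described above can be sketched as a simple set-overlap check. This is an illustrative sketch, not the model's actual pipeline: the tokenization, `n=10` window, and function names are assumptions, and the card's second (fuzzy) step is not shown.

```python
def ngrams(text: str, n: int = 10) -> set[str]:
    """Return the set of whitespace-token n-grams in a text."""
    tokens = text.lower().split()
    return {" ".join(tokens[i:i + n]) for i in range(len(tokens) - n + 1)}

def is_contaminated(train_sample: str, benchmark_questions: list[str], n: int = 10) -> bool:
    """Flag a training sample that shares any n-gram with a benchmark question."""
    train_grams = ngrams(train_sample, n)
    return any(train_grams & ngrams(q, n) for q in benchmark_questions)

benchmark = ["what is the most likely diagnosis in a patient presenting with fever and cough"]
leaked = "question: what is the most likely diagnosis in a patient presenting with fever and cough choose one"
clean = "a completely unrelated training sample discussing renal physiology and acid base balance in detail"
print(is_contaminated(leaked, benchmark))  # True: shares a 10-gram with the benchmark
print(is_contaminated(clean, benchmark))   # False: no 10-gram overlap
```

Flagged samples would then be dropped (or passed to the fuzzy-matching stage) before training.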
Usage Guidelines & Considerations
- Recommended Parameters: Use temperature = 0.6 and top_p = 0.9 for optimal sampling.
- Structured Output: Users are advised to explicitly request step-by-step reasoning and to format the final answer within \boxed{} for best results.
- Limitations: The model's dataset may contain inherent biases, and medical knowledge requires regular updates. It is not suitable for direct medical use or clinical decision-making.
This model is ideal for developers and researchers building AI applications that require strong medical reasoning capabilities, such as medical information retrieval, educational tools, or research assistance.