Intelligent-Internet/II-Medical-32B-Preview
Hugging Face
Text Generation · Model Size: 32B · Quantization: FP8 · Context Length: 32k · Concurrency Cost: 2 · Published: Jul 3, 2025 · License: apache-2.0 · Architecture: Transformer · Open Weights

II-Medical-32B-Preview is a 32 billion parameter large language model developed by Intelligent Internet, fine-tuned from Qwen3-32B. It is designed specifically to enhance AI-driven medical reasoning and medical question answering, and achieves an average score of 71.54% across 10 medical QA benchmarks, demonstrating strong performance in specialized medical contexts. The model is optimized for complex medical reasoning tasks and was trained on a comprehensive collection of medical reasoning datasets.


II-Medical-32B-Preview: Advanced Medical Reasoning Model

Intelligent Internet's II-Medical-32B-Preview is a 32 billion parameter large language model fine-tuned from the Qwen3-32B architecture. The model is engineered to significantly advance AI capabilities in medical reasoning and question answering. Its development involved supervised fine-tuning (SFT) on a comprehensive collection of medical reasoning datasets, using a maximum sequence length of 16,378 tokens during training.

Key Capabilities & Performance

The model demonstrates strong performance across a range of medical benchmarks, achieving an average score of 71.54% over 10 medical QA benchmarks. These include MedMCQA, MedQA, PubMedQA, HealthBench, the medical questions from MMLU-Pro, and specialized QA sets from The Lancet, the New England Journal of Medicine, MedBullets, and MedXpertQA. Notably, its average score exceeds that of its base model, Qwen3-32B, as well as other medical LLMs such as HuatuoGPT-o1-72B and MedGemma-27B-IT.

Training Resources & Usage

Intelligent Internet has also released the training datasets used for SFT and reinforcement learning (RL), including II-Medical-Reasoning-SFT and RL datasets such as II-Medical-RL-MedReason and II-Medical-RL-ChatDoctor. The model can be deployed with serving frameworks like vLLM or SGLang. The recommended sampling parameters are temperature = 0.6 and top_p = 0.9, and prompts should explicitly request step-by-step reasoning with the final answer formatted inside \boxed{}. Note that the model is not intended for real-world medical use, given potential biases and the continually evolving nature of medical knowledge.
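The recommended usage above can be sketched in code. The snippet below is a minimal illustration, not an official client: it builds a chat-completion request in the OpenAI-compatible shape that vLLM and SGLang servers expose, applying the card's recommended sampling parameters and the suggested step-by-step / \boxed{} prompt framing, and shows one way to pull the final answer back out of a response. The helper names, the prompt wording, and the example question are assumptions for illustration.

```python
import re

MODEL_ID = "Intelligent-Internet/II-Medical-32B-Preview"


def build_request(question: str) -> dict:
    """Build a chat-completion payload using the model card's recommended
    sampling settings (temperature=0.6, top_p=0.9) and prompt framing."""
    return {
        "model": MODEL_ID,
        "temperature": 0.6,  # recommended by the model card
        "top_p": 0.9,        # recommended by the model card
        "messages": [
            {
                "role": "user",
                # Explicitly request step-by-step reasoning and a \boxed{}
                # final answer, as the card suggests; exact wording is ours.
                "content": (
                    f"{question}\n\n"
                    "Please reason step by step, and put your final answer "
                    "within \\boxed{}."
                ),
            }
        ],
    }


def extract_boxed(text: str):
    """Return the contents of the last \\boxed{...} span in a model
    response, or None if the model did not emit one."""
    matches = re.findall(r"\\boxed\{([^{}]*)\}", text)
    return matches[-1] if matches else None


if __name__ == "__main__":
    payload = build_request("Which vitamin deficiency causes scurvy?")
    print(payload["temperature"], payload["top_p"])  # 0.6 0.9
    reply = r"Step 1: Collagen synthesis requires it. \boxed{Vitamin C}"
    print(extract_boxed(reply))  # Vitamin C
```

The payload can be POSTed to a running server's `/v1/chat/completions` endpoint (e.g. one started with `vllm serve Intelligent-Internet/II-Medical-32B-Preview`); taking the *last* \boxed{} span is a pragmatic choice, since long reasoning traces may mention intermediate boxed expressions.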