Intelligent-Internet/II-Medical-32B-Preview
Hugging Face
Text Generation · Model Size: 32B · Quantization: FP8 · Context Length: 32k · Concurrency Cost: 2 · Published: Jul 3, 2025 · License: apache-2.0 · Architecture: Transformer · Open Weights

II-Medical-32B-Preview is a 32 billion parameter large language model developed by Intelligent Internet, fine-tuned from Qwen3-32B. It is designed specifically to enhance AI-driven medical reasoning and medical question answering, and achieves an average score of 71.54% across 10 medical QA benchmarks, demonstrating strong performance in specialized medical contexts. The model is optimized for complex medical reasoning tasks and was trained on a comprehensive collection of medical reasoning datasets.


II-Medical-32B-Preview: Advanced Medical Reasoning Model

Intelligent Internet's II-Medical-32B-Preview is a 32 billion parameter large language model fine-tuned from the Qwen3-32B architecture. The model is engineered to significantly advance AI capabilities in medical reasoning and question answering. Its development involved supervised fine-tuning (SFT) on a comprehensive collection of medical reasoning datasets, using a maximum sequence length of 16,378 tokens during training.

Key Capabilities & Performance

The model demonstrates strong performance across a range of medical benchmarks, achieving an average score of 71.54% over 10 medical QA benchmarks. These include MedMCQA, MedQA, PubMedQA, HealthBench, the medical questions from MMLU-Pro, and specialized QA sets from The Lancet, the New England Journal of Medicine, MedBullets, and MedXpertQA. Notably, its average score exceeds that of its base model, Qwen3-32B, as well as other medical LLMs such as HuatuoGPT-o1-72B and MedGemma-27B-IT.

Training Resources & Usage

Intelligent Internet has also released the training datasets used for SFT and reinforcement learning (RL), including II-Medical-Reasoning-SFT and RL datasets such as II-Medical-RL-MedReason and II-Medical-RL-ChatDoctor. The model can be deployed with serving frameworks like vLLM or SGLang. The recommended sampling parameters are temperature = 0.6 and top_p = 0.9, and prompts should explicitly request step-by-step reasoning with the final answer formatted inside \boxed{}. Note that the model is not intended for real-world medical use, given potential biases and the continually evolving nature of medical knowledge.
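The recommended usage above can be sketched in code. The snippet below is a minimal illustration, not an official client: it builds a chat-completion request in the OpenAI-compatible shape that vLLM and SGLang servers expose, applying the card's recommended sampling parameters and the suggested step-by-step / \boxed{} prompt framing, and shows one way to pull the final answer back out of a response. The helper names, the prompt wording, and the example question are assumptions for illustration.

```python
import re

MODEL_ID = "Intelligent-Internet/II-Medical-32B-Preview"


def build_request(question: str) -> dict:
    """Build a chat-completion payload using the model card's recommended
    sampling settings (temperature=0.6, top_p=0.9) and prompt framing."""
    return {
        "model": MODEL_ID,
        "temperature": 0.6,  # recommended by the model card
        "top_p": 0.9,        # recommended by the model card
        "messages": [
            {
                "role": "user",
                # Explicitly request step-by-step reasoning and a \boxed{}
                # final answer, as the card suggests; exact wording is ours.
                "content": (
                    f"{question}\n\n"
                    "Please reason step by step, and put your final answer "
                    "within \\boxed{}."
                ),
            }
        ],
    }


def extract_boxed(text: str):
    """Return the contents of the last \\boxed{...} span in a model
    response, or None if the model did not emit one."""
    matches = re.findall(r"\\boxed\{([^{}]*)\}", text)
    return matches[-1] if matches else None


if __name__ == "__main__":
    payload = build_request("Which vitamin deficiency causes scurvy?")
    print(payload["temperature"], payload["top_p"])  # 0.6 0.9
    reply = r"Step 1: Collagen synthesis requires it. \boxed{Vitamin C}"
    print(extract_boxed(reply))  # Vitamin C
```

The payload can be POSTed to a running server's `/v1/chat/completions` endpoint (e.g. one started with `vllm serve Intelligent-Internet/II-Medical-32B-Preview`); taking the *last* \boxed{} span is a pragmatic choice, since long reasoning traces may mention intermediate boxed expressions.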