Intelligent-Internet/II-Medical-8B

8B parameters · FP8 · 32,768-token context · License: apache-2.0
Overview

II-Medical-8B: Advanced Medical Reasoning Model

II-Medical-8B, developed by Intelligent Internet, is an 8-billion-parameter large language model built on the Qwen/Qwen3-8B architecture and designed specifically for AI-driven medical reasoning and question answering.

Key Capabilities & Training:

  • Medical Reasoning: Engineered for advanced medical question answering, leveraging a comprehensive set of reasoning datasets.
  • Training Methodology: Utilizes Supervised Fine-Tuning (SFT) on Qwen/Qwen3-8B, followed by DAPO optimization on a hard-reasoning dataset to boost performance.
  • Extensive Dataset: Trained on over 555,000 samples, including public medical reasoning datasets, synthetic medical QA data generated with QwQ, and curated medical R1 traces.
  • Data Curation: Employs a multi-stage data curation pipeline (embedding generation, K-means clustering, domain classification, and rigorous decontamination) to ensure data quality and relevance; see the sketch after this list.

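To make the curation steps concrete, here is a minimal, hypothetical sketch of such a pipeline: embed each sample, cluster with K-means for domain inspection, and drop samples that share long n-grams with evaluation sets. The encoder name, n-gram length, and helper functions are illustrative assumptions, not Intelligent Internet's actual code.

```python
# Hypothetical curation sketch: embedding generation, K-means clustering,
# and n-gram decontamination. All names and thresholds are assumptions.
from sentence_transformers import SentenceTransformer
from sklearn.cluster import KMeans

def ngrams(text: str, n: int = 8) -> set:
    toks = text.lower().split()
    return {tuple(toks[i:i + n]) for i in range(len(toks) - n + 1)}

def curate(samples: list, eval_texts: list, k: int = 50):
    # Decontamination: drop any sample sharing an 8-gram with eval data.
    eval_ngrams = set().union(*(ngrams(t) for t in eval_texts))
    clean = [s for s in samples if not (ngrams(s) & eval_ngrams)]

    # Embedding generation + K-means clustering, so each cluster can be
    # inspected and assigned a domain label downstream.
    encoder = SentenceTransformer("all-MiniLM-L6-v2")  # assumed encoder
    labels = KMeans(n_clusters=k, n_init="auto").fit_predict(
        encoder.encode(clean)
    )
    return list(zip(clean, labels))
```
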
Performance & Evaluation:

  • HealthBench: Scores 40% on HealthBench, an open-source benchmark for healthcare LLMs, performing comparably to OpenAI's o1 reasoning model and GPT-4.5.
  • Benchmark Excellence: Shows strong results across ten medical QA benchmarks, including MedMCQA, MedQA, PubMedQA, MMLU-Pro, and GPQA, often outperforming other 8B-class medical models.

Usage Guidelines:

  • Recommended Parameters: Sample with temperature = 0.6 and top_p = 0.9.
  • Reasoning Format: Explicitly request step-by-step reasoning and ask for the final answer inside \boxed{}; a usage sketch follows this list.

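Below is a minimal inference sketch using the Hugging Face transformers library, wired to the recommended settings above. Only the model ID and the sampling parameters come from this card; the prompt wording and generation length are illustrative.

```python
# Minimal inference sketch with transformers; prompt text is illustrative.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "Intelligent-Internet/II-Medical-8B"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id, torch_dtype="auto", device_map="auto"
)

question = (
    "A 54-year-old man presents with crushing chest pain radiating to the "
    "left arm. What is the most likely diagnosis? "
    "Reason step by step, then put the final answer in \\boxed{}."
)
inputs = tokenizer.apply_chat_template(
    [{"role": "user", "content": question}],
    add_generation_prompt=True,
    return_tensors="pt",
).to(model.device)

outputs = model.generate(
    inputs,
    max_new_tokens=2048,
    do_sample=True,
    temperature=0.6,  # recommended temperature
    top_p=0.9,        # recommended nucleus sampling
)
print(tokenizer.decode(outputs[0][inputs.shape[-1]:], skip_special_tokens=True))
```
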
Limitations:

  • The model's dataset may contain inherent biases from source materials.
  • Medical knowledge evolves continuously; the training data has a fixed cutoff and may not reflect recent developments.
  • Not suitable for direct medical use or clinical decision-making.