baichuan-inc/Baichuan-M2-32B

Hugging Face
TEXT GENERATIONConcurrency Cost:2Model Size:32.8BQuant:FP8Ctx Length:32kPublished:Aug 10, 2025License:apache-2.0Architecture:Transformer0.1K Open Weights Warm

Baichuan-M2-32B is Baichuan AI's 32 billion parameter medical-enhanced reasoning model, built upon Qwen2.5-32B. It integrates an innovative Large Verifier System and medical domain adaptation through Mid-Training and multi-stage reinforcement learning. This model is specifically optimized for real-world medical reasoning tasks, achieving leading performance among open-source medical models on HealthBench while maintaining strong general capabilities.

Loading preview...

Baichuan-M2-32B: Medical-Enhanced Reasoning Model

Baichuan-M2-32B is Baichuan AI's second medical model, designed to excel in real-world medical reasoning tasks. Built upon the Qwen2.5-32B architecture, it introduces a novel Large Verifier System and advanced medical domain adaptation techniques to achieve breakthrough medical performance while preserving general capabilities.

Key Innovations & Capabilities

  • Large Verifier System: Incorporates a comprehensive medical verification framework with patient simulators and multi-dimensional verification (8 dimensions including medical accuracy and completeness).
  • Medical Domain Adaptation: Utilizes Mid-Training for efficient medical knowledge injection and a multi-stage reinforcement learning strategy to enhance medical knowledge, reasoning, and patient interaction.
  • Doctor-Thinking Alignment: Trained on real clinical cases and patient simulators, fostering clinical diagnostic thinking and robust patient interaction.
  • Leading Medical Performance: Achieves the world's leading open-source medical model status, outperforming all open-source and many proprietary models on HealthBench, with medical capabilities closest to GPT-5.
  • Efficient Deployment: Supports 4-bit quantization, enabling deployment on a single RTX4090, and offers 58.5% higher token throughput in its MTP version for single-user scenarios.

Performance Highlights

Baichuan-M2-32B demonstrates superior performance on medical benchmarks like HealthBench, scoring 60.1 overall, 34.7 on HealthBench-Hard, and 91.5 on HealthBench-Consensus, surpassing models like gpt-oss-120b and Qwen3-235B-A22B-Thinking-2507. It also shows strong general performance, outperforming Qwen3-32B (Thinking) on benchmarks such as AIME24, Arena-Hard-v2.0, CFBench, and WritingBench.

Intended Use Cases

  • Medical education
  • Health consultation
  • Clinical decision support

Note: This model is for research and reference only and cannot replace professional medical diagnosis or treatment. It is recommended for use under the guidance of medical professionals.

Popular Sampler Settings

Top 3 parameter combinations used by Featherless users for this model. Click a tab to see each config.

temperature
top_p
top_k
–
frequency_penalty
presence_penalty
repetition_penalty
–
min_p
–