TachyHealth/Gazal-R1-32B-GRPO-preview
Gazal-R1-32B-GRPO-preview is a 32.8 billion parameter causal language model developed by TachyHealth and built on Qwen 3 32B. It is fine-tuned for medical reasoning and clinical decision-making using a two-stage pipeline: supervised fine-tuning followed by Group Relative Policy Optimization (GRPO). The model targets diagnostic reasoning, treatment planning, and prognostic assessment, and reports 87.1% on MedQA and 81.6% on MMLU Pro (Medical).
Gazal-R1-32B-GRPO-preview: Medical Reasoning Specialist
Gazal-R1 is a 32.8 billion parameter language model from TachyHealth, built on Qwen 3 32B and engineered for advanced medical reasoning. It is trained in two stages: Supervised Fine-Tuning (SFT), then Group Relative Policy Optimization (GRPO) for alignment. As a result, the model produces structured clinical reasoning and writes its step-by-step explanations inside <think> tags.
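Because the reasoning is wrapped in <think> tags, downstream code will often want to separate it from the rest of the response. The following is a minimal sketch, assuming the output contains at most one complete <think>...</think> block followed by the final answer; the function name `split_reasoning` and the sample string are hypothetical and not part of the model's tooling.

```python
import re

# Matches one <think>...</think> block; DOTALL lets the reasoning span lines.
THINK_RE = re.compile(r"<think>(.*?)</think>", re.DOTALL)


def split_reasoning(text: str) -> tuple[str, str]:
    """Return (reasoning, answer) extracted from raw model output.

    If no complete <think> block is found, the reasoning is empty and the
    whole text is treated as the answer.
    """
    match = THINK_RE.search(text)
    if match is None:
        return "", text.strip()
    reasoning = match.group(1).strip()
    # Assumption: everything after the closing tag is the final answer.
    answer = text[match.end():].strip()
    return reasoning, answer


# Hand-written placeholder output, for illustration only.
sample = "<think>\nStep 1: list findings.\nStep 2: rank options.\n</think>\nFinal answer: B"
reasoning, answer = split_reasoning(sample)
print("Reasoning:", reasoning)
print("Answer:", answer)
```

Keeping the two parts separate makes it easier to show the reasoning to a reviewer while logging or scoring only the final answer.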
Key Capabilities & Performance
- Medical Expertise: Specialized training on over 107,000 synthetic medical reasoning examples, covering diagnostic reasoning, treatment planning, and prognostic assessment.
- State-of-the-Art Benchmarks: Achieves 87.1% on MedQA, 81.6% on MMLU Pro (Medical), and 79.6% on PubMedQA, outperforming models roughly 12 times larger (e.g., Llama 3.1 405B Instruct) in medical domains.
- Transparent Reasoning: Generates structured explanations following established clinical reasoning frameworks.
- Parameter Efficiency: Fine-tuned with Weight-Decomposed LoRA (DoRA) and Rank-Stabilized LoRA (rsLoRA), which train small adapter weights instead of updating the full model.
- Context Length: Supports a native context length of 32,768 tokens, extensible to 131,072 with YaRN.
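Two of the items above reduce to simple arithmetic, sketched here under stated assumptions. The YaRN scaling factor is the ratio of the extended to the native context length, and rsLoRA scales adapter updates by alpha / sqrt(r) where standard LoRA uses alpha / r. The `rope_scaling` key names follow the form commonly documented for Qwen models and should be checked against the actual configuration; the alpha and rank values are hypothetical, since the card does not state them.

```python
import math

NATIVE_CONTEXT = 32_768     # native context length, from the model card
EXTENDED_CONTEXT = 131_072  # YaRN-extended context length, from the model card


def yarn_rope_scaling(native: int, target: int) -> dict:
    """Build a YaRN rope_scaling entry for extending `native` to `target` tokens."""
    if target < native:
        raise ValueError("target context must be at least the native context")
    return {
        "rope_type": "yarn",  # assumed key name; verify against the model config
        "factor": target / native,
        "original_max_position_embeddings": native,
    }


def lora_scaling(alpha: float, rank: int, rank_stabilized: bool) -> float:
    """Adapter scaling: alpha / r for standard LoRA, alpha / sqrt(r) for rsLoRA."""
    if rank_stabilized:
        return alpha / math.sqrt(rank)
    return alpha / rank


print(yarn_rope_scaling(NATIVE_CONTEXT, EXTENDED_CONTEXT))

# Hypothetical hyperparameters, chosen only to show how the scalings differ.
alpha, rank = 16, 64
print("LoRA scaling:", lora_scaling(alpha, rank, rank_stabilized=False))
print("rsLoRA scaling:", lora_scaling(alpha, rank, rank_stabilized=True))
```

With the context lengths listed above, the YaRN factor works out to 4. The rsLoRA scaling shrinks more slowly as the rank grows, which is the motivation for using it with higher-rank adapters.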
Ideal Use Cases
- Research and Education: Excellent for medical education, clinical reasoning research, and academic medical writing assistance.
- Professional Support (Supervised): Can aid in literature review, clinical case analysis, and medical documentation, always requiring verification by qualified professionals.
Important Disclaimers
Gazal-R1 is a research model and NOT intended for direct clinical use, diagnosis, or treatment. All outputs must be independently verified by medical professionals. The model has a knowledge cutoff, so it may lack recent findings, and it may hallucinate. It should never be used in emergency medical situations.