qvac/MedPsy-4B
TEXT GENERATIONConcurrency Cost:1Model Size:4BQuant:BF16Ctx Length:32kPublished:Apr 28, 2026License:apache-2.0Architecture:Transformer0.0K Open Weights Warm
MedPsy-4B is a 4 billion parameter, text-only causal language model developed by Tether AI Research, built on Qwen3-4B-Thinking-2507. Post-trained with a multi-stage SFT and RL pipeline on curated medical data, it excels in medical and healthcare applications. This model surpasses larger models on medical benchmarks and is optimized for efficient, privacy-first edge deployment, offering 3.2x token efficiency.
Loading preview...
MedPsy-4B: A Compact, High-Performance Medical LLM
MedPsy-4B, developed by Tether AI Research, is a 4 billion parameter, text-only medical and healthcare language model. Built upon the Qwen3-4B-Thinking-2507 base, it undergoes a multi-stage post-training process involving supervised fine-tuning (SFT) and reinforcement learning (RL) on specialized medical datasets.
Key Capabilities & Differentiators
- Exceptional Medical Performance: Achieves a score of 70.54 on closed-ended medical benchmarks, outperforming models nearly 7x its size, such as MedGemma-27B-text-it (69.95).
- Real-World Clinical Strength: Scores 74.00 on HealthBench and 58.00 on HealthBench Hard, significantly surpassing MedGemma-27B.
- High Token Efficiency: Demonstrates a 3.2x reduction in average response length compared to its backbone model, leading to faster inference, lower compute costs, and reduced latency.
- Privacy-First Design: Supports fully on-device inference via the QVAC SDK, ensuring patient data remains on the device.
Intended Use Cases
- Research: Ideal for medical language understanding and reasoning studies.
- Developer Tools: Suitable for building prototypes and tools for health-related applications.
- On-Device Retrieval: Excellent for privacy-sensitive medical information retrieval in edge environments.
Limitations
- Not a Medical Substitute: Outputs are not a substitute for professional medical judgment or clinical diagnosis.
- Hallucinations: May generate plausible but incorrect medical information.
- English Only: Performance is validated primarily in English.
- Text Only: Cannot interpret non-text medical data like images or lab results.
- Knowledge Cutoff: Does not reflect the latest medical guidelines or evidence.