IQuestLab/Fleming-R1-7B
Fleming-R1-7B by UbiquantAI is a 7.6 billion parameter reasoning model for medical scenarios, built on Qwen2.5-7B, with a 32K context length. It specializes in step-by-step analysis of complex medical problems, utilizing a "chain-of-thought cold start" and large-scale reinforcement learning. The model achieves state-of-the-art performance among similar-sized models on multiple medical benchmarks, particularly excelling in medical reasoning ability.
Loading preview...
Overview
Fleming-R1-7B is a 7.6 billion parameter medical reasoning model developed by UbiquantAI, based on the Qwen2.5-7B architecture. It is designed to perform step-by-step analysis of complex medical problems and provide reliable answers. The model employs a unique training paradigm involving a "chain-of-thought cold start" and two-stage reinforcement learning, which includes adaptive hard-negative mining to enhance its reasoning capabilities for challenging problems.
Key Capabilities
- Specialized Medical Reasoning: Optimized for medical scenarios, capable of detailed step-by-step analysis.
- State-of-the-Art Performance: Achieves leading results on multiple medical benchmarks among models of comparable size.
- Enhanced Data Strategy: Combines public medical datasets with knowledge graphs to improve coverage of rare diseases, medications, and multi-hop reasoning chains.
- Reinforcement Learning: Utilizes high-quality reasoning traces from teacher models and adaptive hard-negative mining to strengthen problem-solving.
Good For
- Medical Research: Analyzing complex medical cases and generating reasoning traces for research purposes.
- Non-Clinical Reference: Providing detailed information and step-by-step analysis for educational or informational use in medical contexts.
- Benchmarking Medical LLMs: Evaluating and comparing the reasoning abilities of language models in healthcare domains, particularly on benchmarks like MedXpertQA.