IQuestLab/Fleming-R1-7B

TEXT GENERATIONConcurrency Cost:1Model Size:7.6BQuant:FP8Ctx Length:32kTool Calling:SupportedPublished:Sep 16, 2025License:apache-2.0Architecture:Transformer0.0K Open Weights Cold

Fleming-R1-7B by UbiquantAI is a 7.6 billion parameter reasoning model for medical scenarios, built on Qwen2.5-7B, with a 32K context length. It specializes in step-by-step analysis of complex medical problems, utilizing a "chain-of-thought cold start" and large-scale reinforcement learning. The model achieves state-of-the-art performance among similar-sized models on multiple medical benchmarks, particularly excelling in medical reasoning ability.

Loading preview...

Overview

Fleming-R1-7B is a 7.6 billion parameter medical reasoning model developed by UbiquantAI, based on the Qwen2.5-7B architecture. It is designed to perform step-by-step analysis of complex medical problems and provide reliable answers. The model employs a unique training paradigm involving a "chain-of-thought cold start" and two-stage reinforcement learning, which includes adaptive hard-negative mining to enhance its reasoning capabilities for challenging problems.

Key Capabilities

  • Specialized Medical Reasoning: Optimized for medical scenarios, capable of detailed step-by-step analysis.
  • State-of-the-Art Performance: Achieves leading results on multiple medical benchmarks among models of comparable size.
  • Enhanced Data Strategy: Combines public medical datasets with knowledge graphs to improve coverage of rare diseases, medications, and multi-hop reasoning chains.
  • Reinforcement Learning: Utilizes high-quality reasoning traces from teacher models and adaptive hard-negative mining to strengthen problem-solving.

Good For

  • Medical Research: Analyzing complex medical cases and generating reasoning traces for research purposes.
  • Non-Clinical Reference: Providing detailed information and step-by-step analysis for educational or informational use in medical contexts.
  • Benchmarking Medical LLMs: Evaluating and comparing the reasoning abilities of language models in healthcare domains, particularly on benchmarks like MedXpertQA.