che111/AlphaMed-7B-instruct-rl

Text Generation

  • Concurrency Cost: 1
  • Model Size: 7.6B
  • Quantization: FP8
  • Context Length: 32k
  • Published: May 19, 2025
  • License: MIT
  • Architecture: Transformer (open weights)

che111/AlphaMed-7B-instruct-rl is a 7.6 billion parameter medical large language model developed by che111. It is trained for medical reasoning tasks using reinforcement learning alone, without supervised fine-tuning on chain-of-thought data. The model is designed to elicit step-by-step reasoning in complex medical scenarios, making it suitable for diagnostic support and medical question answering.


AlphaMed-7B-instruct-rl Overview

AlphaMed-7B-instruct-rl is a 7.6 billion parameter medical large language model developed by che111. Its core innovation lies in its training methodology: it is trained exclusively with reinforcement learning (RL) to foster step-by-step reasoning in medical contexts, completely bypassing supervised fine-tuning on chain-of-thought (CoT) data.

Key Capabilities

  • Medical Reasoning: Designed to perform complex medical reasoning tasks.
  • Step-by-Step Elicitation: Utilizes RL to generate detailed, sequential reasoning processes for medical questions.
  • Specialized Training: Focuses on medical domain knowledge and problem-solving without reliance on traditional CoT datasets.

Good For

  • Medical Question Answering: Answering specific medical queries requiring diagnostic or treatment reasoning.
  • Diagnostic Support: Assisting in identifying likely diagnoses based on patient symptoms and lab results.
  • Research in RL for Medical LLMs: Exploring models trained with minimalist rule-based RL for medical applications.
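As a sketch of how the model might be queried for step-by-step medical reasoning: the snippet below builds a chat-style prompt that explicitly asks for sequential reasoning before a final answer. The system prompt wording and the commented-out `transformers` pipeline call are illustrative assumptions, not documented usage — check the model card for the exact chat template this checkpoint expects.

```python
# Hypothetical usage sketch for AlphaMed-7B-instruct-rl.
# The prompt format below is an assumption; verify the model's
# actual chat template before relying on it.

def build_medical_prompt(question: str) -> list[dict]:
    """Wrap a medical question in a chat-style message list that asks
    the model to reason step by step before giving a final answer."""
    system = (
        "You are a careful medical assistant. Reason step by step, "
        "then state your final answer on the last line."
    )
    return [
        {"role": "system", "content": system},
        {"role": "user", "content": question},
    ]

if __name__ == "__main__":
    messages = build_medical_prompt(
        "A 45-year-old presents with polyuria, polydipsia, and an "
        "HbA1c of 8.2%. What is the most likely diagnosis?"
    )
    # With a suitable GPU, inference could then look like
    # (not executed here; requires downloading the 7.6B weights):
    #
    # from transformers import pipeline
    # pipe = pipeline("text-generation",
    #                 model="che111/AlphaMed-7B-instruct-rl")
    # print(pipe(messages, max_new_tokens=512)[0]["generated_text"])
    for m in messages:
        print(f"[{m['role']}] {m['content']}")
```

Keeping the reasoning instruction in the system prompt mirrors how the model is described as eliciting sequential reasoning, but since the RL training recipe may bake in its own format, treat this as a starting point to adapt.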