che111/AlphaMed-8B-instruct-rl
Task: Text generation
Concurrency cost: 1
Model size: 8B
Quantization: FP8
Context length: 32k
Published: May 19, 2025
License: MIT
Architecture: Transformer
Open weights · Cold

AlphaMed-8B-instruct-rl by che111 is an 8-billion-parameter medical large language model with a 32,768-token context length. Unusually, it is trained without supervised fine-tuning on chain-of-thought data, relying solely on reinforcement learning to elicit step-by-step reasoning for complex medical tasks. The model targets medical question answering, incentivizing detailed reasoning through rule-based rewards.
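As a sketch of how the model might be queried, assuming it is hosted on the Hugging Face Hub with standard `transformers` support (the model id is taken from this card; the prompt wording and generation parameters are illustrative assumptions, not documented defaults):

```python
# Hypothetical usage sketch for AlphaMed-8B-instruct-rl via the Hugging Face
# `transformers` library. Prompt wording and generation settings below are
# illustrative assumptions, not documented defaults for this model.

MODEL_ID = "che111/AlphaMed-8B-instruct-rl"

def build_prompt(question: str) -> str:
    # Ask explicitly for step-by-step reasoning, matching the
    # RL-incentivized reasoning style described in the card.
    return (
        "Answer the following medical question. "
        "Reason step by step, then state a final answer.\n\n"
        f"Question: {question}\nAnswer:"
    )

def answer(question: str, max_new_tokens: int = 512) -> str:
    # Imported lazily so the prompt helper can be used or tested without
    # the heavyweight dependency installed.
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
    model = AutoModelForCausalLM.from_pretrained(MODEL_ID, device_map="auto")
    inputs = tokenizer(build_prompt(question), return_tensors="pt").to(model.device)
    output = model.generate(**inputs, max_new_tokens=max_new_tokens)
    # Decode only the newly generated tokens, not the echoed prompt.
    return tokenizer.decode(
        output[0][inputs["input_ids"].shape[1]:], skip_special_tokens=True
    )

# Example (requires downloading the 8B weights):
# print(answer("What is the first-line treatment for uncomplicated hypertension?"))
```

Given the 32k context length, long patient histories or multi-question prompts fit in a single call, though output length is still bounded by `max_new_tokens`.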
