mlxha/Qwen3-8B-grpo-medmcqa
Text Generation · Concurrency Cost: 1 · Model Size: 8B · Quant: FP8 · Ctx Length: 32k · Published: May 8, 2025 · Architecture: Transformer

The mlxha/Qwen3-8B-grpo-medmcqa model is an 8-billion-parameter language model fine-tuned from Qwen/Qwen3-8B, with a 32,768-token context length. It was trained with GRPO (Group Relative Policy Optimization) on the medmcqa-grpo dataset, specifically optimizing its performance on medical multiple-choice question answering. The model is designed to strengthen reasoning capabilities in specialized domains such as medical knowledge.
