xw1234gan/GRPO_KL_Qwen2.5-3B-Instruct_MedQA_beta0.01_lr1e-05_mb2_ga128_n2048_seed42
TEXT GENERATIONConcurrency Cost:1Model Size:3.1BQuant:BF16Ctx Length:32kPublished:Mar 16, 2026Architecture:Transformer Warm

The xw1234gan/GRPO_KL_Qwen2.5-3B-Instruct_MedQA_beta0.01_lr1e-05_mb2_ga128_n2048_seed42 is a 3.1 billion parameter instruction-tuned model based on the Qwen2.5 architecture. This model is specifically fine-tuned for medical question answering (MedQA) tasks, making it suitable for applications requiring specialized knowledge in the medical domain. Its primary strength lies in processing and generating responses relevant to medical inquiries, distinguishing it from general-purpose language models.

Loading preview...