hdong0/Qwen3-8B-base-Open-R1-GRPO_dapo_acc_16384_nokl
Text generation · Concurrency cost: 1 · Model size: 8B · Quant: FP8 · Context length: 32k · Published: Oct 7, 2025 · Architecture: Transformer

hdong0/Qwen3-8B-base-Open-R1-GRPO_dapo_acc_16384_nokl is an 8-billion-parameter language model fine-tuned from Qwen3-8B-Base. Developed by hdong0, the model specializes in mathematical reasoning: it was trained with the GRPO method on the DAPO-Math-17k-Processed dataset. It targets tasks requiring advanced mathematical problem-solving and supports a 32,768-token context length.
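To make the training method concrete, here is a minimal sketch of the group-relative advantage computation at the core of GRPO. This is an illustration of the general technique, not the author's training code: for each prompt, several completions are sampled and scored (e.g., reward 1 for a correct math answer, 0 otherwise), and each completion's advantage is its reward normalized against the group's mean and standard deviation. The function name `grpo_advantages` and the epsilon constant are assumptions for this sketch.

```python
# Hedged sketch of GRPO's group-relative advantage, not the model's
# actual training pipeline. Pure standard library, no ML dependencies.
from statistics import mean, pstdev

def grpo_advantages(group_rewards, eps=1e-8):
    """Normalize each reward against its sampling group's statistics."""
    mu = mean(group_rewards)
    sigma = pstdev(group_rewards)  # population std over the group
    return [(r - mu) / (sigma + eps) for r in group_rewards]

# Example: 4 sampled answers to one math problem, two of them correct.
rewards = [1.0, 0.0, 0.0, 1.0]
advantages = grpo_advantages(rewards)
print(advantages)  # correct answers get positive advantage, wrong ones negative
```

Because the advantage is computed relative to other samples for the same prompt, GRPO needs no separate value network; the "nokl" suffix in the model name suggests the KL-divergence penalty toward the reference model was also disabled, though that is an inference from the name rather than a documented fact.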
