Yukang/Qwen2.5-7B-Open-R1-GRPO
Text generation · Concurrency cost: 1 · Model size: 7.6B · Quantization: FP8 · Context length: 32k · Published: Jun 19, 2025 · Architecture: Transformer

Yukang/Qwen2.5-7B-Open-R1-GRPO is a 7.6-billion-parameter language model fine-tuned from Qwen/Qwen2.5-7B-Instruct using the GRPO method on the OpenR1-Math-220k dataset. The fine-tuning specifically targets advanced mathematical reasoning and complex problem-solving.
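A minimal usage sketch, assuming the model loads with Hugging Face `transformers` and follows the standard Qwen2.5 chat template (the system prompt and generation settings below are illustrative, not prescribed by the model card):

```python
MODEL_ID = "Yukang/Qwen2.5-7B-Open-R1-GRPO"


def build_messages(problem: str) -> list[dict]:
    """Wrap a math problem in a chat-style message list."""
    return [
        {
            "role": "system",
            "content": "You are a helpful math assistant. Reason step by step.",
        },
        {"role": "user", "content": problem},
    ]


def solve(problem: str, max_new_tokens: int = 1024) -> str:
    """Generate a solution; downloads the ~7.6B-parameter weights on first call."""
    # Imported here so that build_messages() works without transformers installed.
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
    model = AutoModelForCausalLM.from_pretrained(
        MODEL_ID, torch_dtype="auto", device_map="auto"
    )
    prompt = tokenizer.apply_chat_template(
        build_messages(problem), tokenize=False, add_generation_prompt=True
    )
    inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
    out = model.generate(**inputs, max_new_tokens=max_new_tokens)
    # Decode only the newly generated tokens, not the echoed prompt.
    return tokenizer.decode(
        out[0][inputs["input_ids"].shape[1]:], skip_special_tokens=True
    )


if __name__ == "__main__":
    print(solve("What is the sum of the first 100 positive integers?"))
```

With a 32k-token context window, long multi-step derivations fit in a single turn; the FP8 quantization listed above is a serving detail and does not change this calling pattern.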
