Lucien520/Qwen2.5-1.5B-Open-R1-GRPO
TEXT GENERATION | Concurrency Cost: 1 | Model Size: 1.5B | Quant: BF16 | Ctx Length: 32k | Published: Dec 5, 2025 | Architecture: Transformer | Cold
Lucien520/Qwen2.5-1.5B-Open-R1-GRPO is a 1.5 billion parameter language model built on the Qwen2.5 architecture and fine-tuned with GRPO (Group Relative Policy Optimization), the reinforcement learning method introduced in the DeepSeekMath research to improve mathematical reasoning. It is optimized for mathematical problem-solving and suits applications where numerical and logical reasoning are critical. The model card gives a context length of 131072 tokens; the listing metadata above reports 32k.
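To illustrate the idea behind GRPO, the following is a minimal sketch of its group-relative advantage step: several completions are sampled for one prompt, and each completion's reward is normalized against the mean and standard deviation of its own group. It is not this model's training code. The function name `group_relative_advantages`, the `eps` term, the use of the population standard deviation, and the reward values are all assumptions made for the example.

```python
from statistics import mean, pstdev


def group_relative_advantages(rewards, eps=1e-8):
    """Normalize each reward against its own group of sampled completions.

    Assumption: population standard deviation plus a small eps to avoid
    division by zero when every completion gets the same reward.
    """
    mu = mean(rewards)
    sigma = pstdev(rewards)
    return [(r - mu) / (sigma + eps) for r in rewards]


# Hypothetical rewards for four completions sampled for one math prompt
# (1.0 = correct final answer, 0.0 = incorrect).
rewards = [1.0, 0.0, 0.0, 1.0]
advantages = group_relative_advantages(rewards)

# Correct completions get a positive advantage, incorrect ones a negative one.
for reward, advantage in zip(rewards, advantages):
    print(f"reward={reward:.1f} advantage={advantage:+.3f}")
```

Because the baseline comes from the group itself, this step needs no separate value model. When every completion in a group receives the same reward, all advantages are zero and that prompt contributes no learning signal.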