mimoidochi/OpenRS-GRPO-1
Text Generation · Concurrency Cost: 1 · Model Size: 1.5B · Quant: BF16 · Ctx Length: 32k · Published: Mar 16, 2026 · Architecture: Transformer · Warm

mimoidochi/OpenRS-GRPO-1 is a 1.5 billion parameter language model fine-tuned from DeepSeek-R1-Distill-Qwen-1.5B using GRPO (Group Relative Policy Optimization), a reinforcement-learning method designed to enhance mathematical reasoning. Trained on the open-rs dataset, it is optimized for tasks that require robust multi-step reasoning. Its 32,768-token context length makes it suitable for applications demanding detailed contextual understanding and logical inference.
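A minimal usage sketch with the Hugging Face `transformers` library is shown below. The model ID comes from this card; the prompt template, generation parameters, and helper names are illustrative assumptions, not part of the card.

```python
# Hypothetical usage sketch for mimoidochi/OpenRS-GRPO-1.
# The prompt template and generation settings below are assumptions,
# not documented behavior of this model.

MODEL_ID = "mimoidochi/OpenRS-GRPO-1"
MAX_CTX = 32768  # context length stated on the model card


def build_prompt(question: str) -> str:
    """Wrap a math question in a simple step-by-step instruction.

    The exact prompt format expected by the model is an assumption here;
    adjust it to whatever template the model was actually trained with.
    """
    return (
        "Please reason step by step and put the final answer in \\boxed{}.\n"
        + question
    )


def main() -> None:
    # Heavy imports and the model download are kept out of module import time.
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
    model = AutoModelForCausalLM.from_pretrained(MODEL_ID, torch_dtype="bfloat16")

    inputs = tokenizer(build_prompt("What is 17 * 24?"), return_tensors="pt")
    outputs = model.generate(**inputs, max_new_tokens=512)
    print(tokenizer.decode(outputs[0], skip_special_tokens=True))


if __name__ == "__main__":
    main()
```

Keeping the `transformers` import inside `main()` means the prompt helper can be reused (or tested) without downloading the 1.5B checkpoint.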
