ypwang61/One-Shot-RLVR-Qwen2.5-Math-7B-1.2k-dsr-sub
Text Generation | Concurrency Cost: 1 | Model Size: 7.6B | Quant: FP8 | Ctx Length: 32k | Published: Aug 27, 2025 | License: apache-2.0 | Architecture: Transformer | Open Weights | Cold

The ypwang61/One-Shot-RLVR-Qwen2.5-Math-7B-1.2k-dsr-sub model is a specialized language model published by ypwang61, built on the Qwen2.5 architecture. It is fine-tuned for mathematical reasoning with Reinforcement Learning with Verifiable Rewards (RLVR), as part of the One-Shot RLVR work on training with as little as one example. The model targets multi-step mathematical problem solving, where a final answer can be checked against a known reference.
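RLVR-style training relies on a reward that can be checked automatically instead of being predicted by a learned reward model. The sketch below illustrates that idea for math problems, assuming the model writes its final answer inside `\boxed{...}` and that an exact string match against the reference is good enough. The function names `extract_boxed_answer` and `verifiable_reward` are hypothetical and are not taken from the model's training code, which may normalize or compare answers differently.

```python
import re
from typing import Optional


def extract_boxed_answer(completion: str) -> Optional[str]:
    """Return the content of the last \\boxed{...} in a completion, or None.

    Assumption: answers contain no nested braces (e.g. no \\frac{1}{2}).
    """
    matches = re.findall(r"\\boxed\{([^{}]*)\}", completion)
    return matches[-1].strip() if matches else None


def verifiable_reward(completion: str, ground_truth: str) -> float:
    """Binary reward: 1.0 if the extracted answer equals the reference, else 0.0."""
    answer = extract_boxed_answer(completion)
    if answer is None:
        # No parseable final answer counts as incorrect.
        return 0.0
    return 1.0 if answer == ground_truth.strip() else 0.0


if __name__ == "__main__":
    sample = "Adding the two terms gives 40 + 2, so the result is \\boxed{42}."
    print(verifiable_reward(sample, "42"))
```

A binary, rule-based check like this is cheap to run on every sampled completion. A practical checker would likely also need to treat mathematically equivalent forms (for example `0.5` and `1/2`) as equal, which this sketch does not attempt.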
