ypwang61/One-Shot-RLVR-Qwen2.5-Math-1.5B-pi1
Text generation · Concurrency cost: 1 · Model size: 1.5B · Quant: BF16 · Context length: 32k · Published: May 17, 2025 · License: apache-2.0 · Architecture: Transformer · Open weights
ypwang61/One-Shot-RLVR-Qwen2.5-Math-1.5B-pi1 is a 1.5-billion-parameter language model based on the Qwen2.5 architecture, with a 32,768-token context length. Developed by ypwang61, it targets mathematical reasoning tasks. The model is fine-tuned with Reinforcement Learning with Verifiable Rewards (RLVR) in a one-shot setup, meaning it was trained on a single example (denoted pi1), making it a notable case of effective complex problem-solving learned from minimal training data.
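The checkpoint can be loaded with the Hugging Face `transformers` library. The snippet below is a minimal sketch, assuming the weights are hosted on the Hub under the repo ID above and that `torch` and `transformers` are installed; the prompt template is an illustrative guess, not necessarily the one used during RLVR training.

```python
MODEL_ID = "ypwang61/One-Shot-RLVR-Qwen2.5-Math-1.5B-pi1"


def build_prompt(question: str) -> str:
    # Plain instruction-style prompt. The exact template the authors
    # used during training is an assumption here.
    return f"Question: {question}\nPlease reason step by step.\nAnswer:"


def generate_answer(question: str, max_new_tokens: int = 512) -> str:
    # Imported lazily so build_prompt() works without torch installed.
    import torch
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
    model = AutoModelForCausalLM.from_pretrained(
        MODEL_ID,
        torch_dtype=torch.bfloat16,  # matches the BF16 quant listed above
        device_map="auto",
    )
    inputs = tokenizer(build_prompt(question), return_tensors="pt").to(model.device)
    out = model.generate(**inputs, max_new_tokens=max_new_tokens)
    # Decode only the newly generated tokens, not the prompt.
    new_tokens = out[0][inputs["input_ids"].shape[1]:]
    return tokenizer.decode(new_tokens, skip_special_tokens=True)
```

For example, `generate_answer("What is 17 * 23?")` downloads the ~1.5B-parameter weights on first use and returns the model's step-by-step solution as a string.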