openbmb/RLPR-Qwen2.5-7B-Base
Text generation · Model size: 7.6B · Quantization: FP8 · Context length: 32k · Concurrency cost: 1 · Published: Jun 22, 2025 · License: apache-2.0 · Architecture: Transformer

openbmb/RLPR-Qwen2.5-7B-Base is a 7.6-billion-parameter language model from OpenBMB, fine-tuned from Qwen2.5-7B-Base with the RLPR framework. RLPR is a verifier-free reinforcement learning approach: instead of scoring responses with an external rule-based or model-based verifier, it uses the policy's intrinsic probability of generating the reference answer as the reward signal. The resulting model performs strongly on general and mathematical reasoning benchmarks, making it suitable for applications that need robust logical inference without an external verification step. The model has a context length of 131,072 tokens.
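The sketch below shows two things: loading the checkpoint for ordinary text generation with the standard Hugging Face transformers API, and a conceptual illustration of a probability-based reward in the spirit of RLPR (scoring a response by the mean probability the model assigns to the reference answer). It is a minimal sketch, not OpenBMB's training code; the prompt, reference answer, generation settings, and the `probability_reward` helper are all illustrative. Only the model ID comes from this card.

import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "openbmb/RLPR-Qwen2.5-7B-Base"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,  # illustrative; FP8 inference needs dedicated tooling
    device_map="auto",
)

# Plain-text prompting, since this is a base-style checkpoint.
prompt = "Q: A train travels 120 km in 1.5 hours. What is its average speed?\nA:"
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
output_ids = model.generate(**inputs, max_new_tokens=256, do_sample=False)
completion = tokenizer.decode(
    output_ids[0][inputs["input_ids"].shape[1]:], skip_special_tokens=True
)
print(completion)

def probability_reward(context: str, reference_answer: str) -> float:
    """Conceptual RLPR-style reward (hypothetical helper): mean per-token
    probability the model assigns to the reference answer, conditioned on the
    context (prompt plus a sampled reasoning trace)."""
    ctx_ids = tokenizer(context, return_tensors="pt").input_ids.to(model.device)
    ans_ids = tokenizer(
        reference_answer, add_special_tokens=False, return_tensors="pt"
    ).input_ids.to(model.device)
    full_ids = torch.cat([ctx_ids, ans_ids], dim=1)
    with torch.no_grad():
        logits = model(full_ids).logits
    # Logits at position i predict token i+1; take the slice covering the answer.
    ans_logits = logits[0, ctx_ids.shape[1] - 1 : full_ids.shape[1] - 1]
    probs = torch.softmax(ans_logits.float(), dim=-1)
    token_probs = probs[torch.arange(ans_ids.shape[1]), ans_ids[0]]
    return token_probs.mean().item()

# Higher values indicate a reasoning trace under which the reference answer
# becomes more likely; RLPR uses this kind of signal in place of a verifier.
print(probability_reward(prompt + completion, " 80 km/h"))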
