openbmb/RLPR-Llama3.1-8B-Inst
Text Generation | Concurrency Cost: 1 | Model Size: 8B | Quant: FP8 | Ctx Length: 32k | Published: Jun 22, 2025 | License: apache-2.0 | Architecture: Transformer | Open Weights | Cold

The openbmb/RLPR-Llama3.1-8B-Inst model is an 8-billion-parameter instruction-tuned language model developed by OpenBMB and built on Llama3.1-8B-Instruct. It is trained with the RLPR framework, which enhances reasoning without an external verifier by using the model's own intrinsic generation probability as a direct reward signal. The model targets general and mathematical reasoning tasks, where it shows substantial improvements over baselines without requiring external verifiers or specialized fine-tuning.
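
To make the idea of a probability-based reward more concrete, here is a minimal sketch of how such a signal might be computed from per-token log-probabilities of a reference answer. The function names (`probability_reward`, `debiased_reward`), the choice of a mean over token probabilities, and the subtraction of a no-reasoning baseline are all assumptions for illustration; this description does not specify the exact formula RLPR uses.

```python
import math
from typing import Sequence


def probability_reward(ref_token_logprobs: Sequence[float]) -> float:
    """Mean per-token probability the model assigns to the reference answer.

    Hypothetical helper: averaging token probabilities is an assumed
    aggregation, not a formula confirmed by the model description.
    """
    if not ref_token_logprobs:
        raise ValueError("need at least one reference-answer token")
    probs = [math.exp(lp) for lp in ref_token_logprobs]
    return sum(probs) / len(probs)


def debiased_reward(
    logprobs_with_reasoning: Sequence[float],
    logprobs_without_reasoning: Sequence[float],
) -> float:
    """Reward the gain in reference-answer probability due to the reasoning.

    Assumed design: subtract the probability obtained without any reasoning,
    then clip to [0, 1] so the reward stays bounded.
    """
    gain = probability_reward(logprobs_with_reasoning) - probability_reward(
        logprobs_without_reasoning
    )
    return min(1.0, max(0.0, gain))


if __name__ == "__main__":
    # Toy numbers: the reference answer becomes more likely after reasoning.
    with_reasoning = [math.log(0.9), math.log(0.8)]
    without_reasoning = [math.log(0.3), math.log(0.4)]
    print(debiased_reward(with_reasoning, without_reasoning))
```

In this sketch no external verifier is called: the only inputs are log-probabilities the policy model itself produces, which is the property the description above attributes to RLPR.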
