ryokamoi/Qwen-2.5-7B-FoVer-PRM-2026
Text generation · Concurrency cost: 1 · Model size: 7.6B · Quantization: FP8 · Context length: 32k · Published: Apr 7, 2026 · License: apache-2.0 · Architecture: Transformer · Open weights

The ryokamoi/Qwen-2.5-7B-FoVer-PRM-2026 is a 7.6 billion parameter Process Reward Model (PRM) based on the Qwen 2.5 architecture, developed by Ryo Kamoi and collaborators. It is specifically fine-tuned using the FoVer framework, which synthesizes PRM training data via formal verification tools like Z3 and Isabelle. This model excels at evaluating step-level reasoning traces, particularly in math, logic, and theorem proving tasks, and demonstrates improved performance on informal reasoning benchmarks. Its primary use is to provide efficient and accurate process supervision for enhancing LLM reasoning capabilities without human annotation or extensive LLM calls.


FoVer: Formally Verified Process Reward Model

The ryokamoi/Qwen-2.5-7B-FoVer-PRM-2026 is a 7.6 billion parameter Process Reward Model (PRM) developed by Ryo Kamoi and collaborators, designed to improve the reasoning capabilities of large language models (LLMs). This model leverages the novel FoVer framework, which generates PRM training data through formal verification tools such as Z3 and Isabelle. This approach allows for the efficient and accurate annotation of step-level error labels in reasoning traces, bypassing the need for costly human annotation or repeated LLM calls.

Key Capabilities

  • Efficient PRM Data Synthesis: Utilizes formal verification to create high-quality training data for process supervision.
  • Enhanced Reasoning Evaluation: Fine-tuned to evaluate the correctness of individual steps in a reasoning process.
  • Broad Applicability: Demonstrates improved performance not only on formal math and logic tasks but also on informal reasoning benchmarks like NLI and BBH.
  • Step-Level Scoring: Provides granular scores for each step in a solution, indicating correctness.
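Step-level scores are typically aggregated to compare whole solutions, for example in best-of-N selection. A minimal sketch, assuming the PRM returns one correctness probability per step (the scores below are made up for illustration):

```python
def score_solution(step_probs, how="min"):
    """Aggregate per-step correctness probabilities into one solution score.

    'min' (score of the weakest step) and 'prod' (probability that all
    steps are jointly correct) are two common PRM aggregation choices.
    """
    if not step_probs:
        return 0.0
    if how == "min":
        return min(step_probs)
    if how == "prod":
        total = 1.0
        for p in step_probs:
            total *= p
        return total
    raise ValueError(f"unknown aggregation: {how}")

def best_of_n(candidates):
    """Pick the candidate solution whose weakest step is strongest."""
    return max(candidates, key=lambda steps: score_solution(steps, "min"))

# Hypothetical per-step scores for three candidate solutions
candidates = [[0.9, 0.95, 0.4], [0.8, 0.85, 0.9], [0.99, 0.2, 0.99]]
best = best_of_n(candidates)  # selects [0.8, 0.85, 0.9]
```

The `min` aggregation rewards solutions with no weak link, which matches the PRM's purpose: a single incorrect step invalidates an otherwise plausible trace.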

Good For

  • Improving LLM Reasoning: Ideal for developers looking to enhance the logical and mathematical reasoning abilities of LLMs.
  • Automated Process Supervision: Suitable for applications requiring automated feedback on reasoning steps without manual intervention.
  • Research in Formal Verification and LLMs: Useful for exploring the intersection of formal methods and language model development, as detailed in the paper "Efficient PRM Training Data Synthesis via Formal Verification" (ACL 2026 Findings, arXiv:2505.15960).