ryokamoi/Qwen-2.5-7B-FoVer-PRM-2026
Text generation · Concurrency cost: 1 · Model size: 7.6B · Quantization: FP8 · Context length: 32k · Published: Apr 7, 2026 · License: apache-2.0 · Architecture: Transformer · Open weights

The ryokamoi/Qwen-2.5-7B-FoVer-PRM-2026 is a 7.6 billion parameter Process Reward Model (PRM) based on the Qwen 2.5 architecture, developed by Ryo Kamoi and collaborators. It is specifically fine-tuned using the FoVer framework, which synthesizes PRM training data via formal verification tools like Z3 and Isabelle. This model excels at evaluating step-level reasoning traces, particularly in math, logic, and theorem proving tasks, and demonstrates improved performance on informal reasoning benchmarks. Its primary use is to provide efficient and accurate process supervision for enhancing LLM reasoning capabilities without human annotation or extensive LLM calls.


FoVer: Formally Verified Process Reward Model

The ryokamoi/Qwen-2.5-7B-FoVer-PRM-2026 is a 7.6 billion parameter Process Reward Model (PRM) developed by Ryo Kamoi and collaborators, designed to improve the reasoning capabilities of large language models (LLMs). This model leverages the novel FoVer framework, which generates PRM training data through formal verification tools such as Z3 and Isabelle. This approach allows for the efficient and accurate annotation of step-level error labels in reasoning traces, bypassing the need for costly human annotation or repeated LLM calls.

Key Capabilities

  • Efficient PRM Data Synthesis: Utilizes formal verification to create high-quality training data for process supervision.
  • Enhanced Reasoning Evaluation: Fine-tuned to evaluate the correctness of individual steps in a reasoning process.
  • Broad Applicability: Demonstrates improved performance not only on formal math and logic tasks but also on informal reasoning benchmarks like NLI and BBH.
  • Step-Level Scoring: Provides granular scores for each step in a solution, indicating correctness.
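Step-level scores are typically aggregated to compare whole solutions, for example in best-of-N selection. A minimal sketch, assuming the PRM returns one correctness probability per step (the scores below are made up for illustration):

```python
def score_solution(step_probs, how="min"):
    """Aggregate per-step correctness probabilities into one solution score.

    'min' (score of the weakest step) and 'prod' (probability that all
    steps are jointly correct) are two common PRM aggregation choices.
    """
    if not step_probs:
        return 0.0
    if how == "min":
        return min(step_probs)
    if how == "prod":
        total = 1.0
        for p in step_probs:
            total *= p
        return total
    raise ValueError(f"unknown aggregation: {how}")

def best_of_n(candidates):
    """Pick the candidate solution whose weakest step is strongest."""
    return max(candidates, key=lambda steps: score_solution(steps, "min"))

# Hypothetical per-step scores for three candidate solutions
candidates = [[0.9, 0.95, 0.4], [0.8, 0.85, 0.9], [0.99, 0.2, 0.99]]
best = best_of_n(candidates)  # selects [0.8, 0.85, 0.9]
```

The `min` aggregation rewards solutions with no weak link, which matches the PRM's purpose: a single incorrect step invalidates an otherwise plausible trace.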

Good For

  • Improving LLM Reasoning: Ideal for developers looking to enhance the logical and mathematical reasoning abilities of LLMs.
  • Automated Process Supervision: Suitable for applications requiring automated feedback on reasoning steps without manual intervention.
  • Research in Formal Verification and LLMs: Useful for exploring the intersection of formal methods and language model development, as detailed in the paper "Efficient PRM Training Data Synthesis via Formal Verification" (ACL 2026 Findings, arXiv:2505.15960).