ryokamoi/Llama-3.1-8B-FoVer-PRM-old
Text Generation · Concurrency Cost: 1 · Model Size: 8B · Quant: FP8 · Context Length: 32k · Published: May 21, 2025 · License: llama3.1 · Architecture: Transformer

The ryokamoi/Llama-3.1-8B-FoVer-PRM-old is an 8 billion parameter Llama 3.1-based Process Reward Model (PRM) developed by Ryo Kamoi and the PSU NLP Group, with a 32768 token context length. It is trained on step-level error labels produced automatically with formal verification tools (Z3, Isabelle), enabling it to give step-level feedback on reasoning tasks. The model is strongest at verifying formal logic and proof steps, and this verification ability transfers across tasks, improving verification on mathematics, academic problems, and abstract reasoning.


ryokamoi/Llama-3.1-8B-FoVer-PRM-old: Formal Verification for LLM Reasoning

This model is an 8 billion parameter Process Reward Model (PRM) based on Llama 3.1, developed by Ryo Kamoi and the PSU NLP Group. It is designed to provide step-level feedback on the reasoning generated by large language models (LLMs), enhancing their capabilities through reinforcement learning and inference-time refinement. The model leverages a novel approach called FoVer, which synthesizes PRM training data using formal verification tools like Z3 and Isabelle to automatically annotate step-level errors.
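The data-synthesis idea behind FoVer can be illustrated with a simplified, self-contained analogue: check each step of a solution programmatically and record a per-step correct/incorrect label. This sketch substitutes a plain Python evaluator for the formal tools (Z3, Isabelle), and the `lhs = rhs` step format is hypothetical:

```python
# Simplified analogue of FoVer-style automatic step annotation.
# Real FoVer uses formal verification tools; here a plain arithmetic
# evaluator plays that role. The step format is an assumption.

def annotate_steps(steps: list[str]) -> list[bool]:
    """Return a correct/incorrect label for each 'lhs = rhs' step."""
    labels = []
    for step in steps:
        lhs, _, rhs = step.partition("=")
        try:
            # eval() stands in for a formal equality check (illustration only).
            labels.append(eval(lhs) == eval(rhs))
        except Exception:
            labels.append(False)  # an unparseable step counts as an error
    return labels

solution = ["2 + 3 = 5", "5 * 4 = 20", "20 - 7 = 12"]  # last step is wrong
print(annotate_steps(solution))  # → [True, True, False]
```

Pairing each step with such a label is exactly the shape of supervision a PRM is trained on.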

Key Capabilities

  • Automated Error Annotation: Utilizes formal verification to generate precise step-level error labels for LLM responses.
  • Cross-Task Transfer: Demonstrates the ability to transfer verification capabilities learned in formal logic and proof tasks to a broad range of other reasoning tasks, including mathematics, academic problems, and abstract reasoning.
  • Step-Level Feedback: Provides granular feedback on individual steps within an LLM's reasoning process, crucial for improving complex problem-solving.
  • High Context Length: Supports a context length of 32768 tokens, allowing for analysis of extensive reasoning chains.
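A minimal sketch of querying the model for a step-level judgment via Hugging Face transformers follows. The prompt layout, the trailing question, and the `" correct"`/`" incorrect"` label tokens are assumptions for illustration; consult the model card for the exact input format used in FoVer training.

```python
# Sketch: step-level scoring with transformers. Prompt format and label
# tokens are assumptions, not the documented FoVer format.
MODEL_ID = "ryokamoi/Llama-3.1-8B-FoVer-PRM-old"

def build_prompt(problem: str, steps: list[str]) -> str:
    """Assumed layout: the problem followed by numbered reasoning steps."""
    lines = [f"Problem: {problem}"]
    lines += [f"Step {i}: {s}" for i, s in enumerate(steps, start=1)]
    return "\n".join(lines)

def score_last_step(problem: str, steps: list[str]) -> float:
    """P(last step is correct); loads the 8B model, so expect to need a GPU.

    Imports are local so the lightweight helper above works without
    transformers installed.
    """
    import torch
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tok = AutoTokenizer.from_pretrained(MODEL_ID)
    model = AutoModelForCausalLM.from_pretrained(
        MODEL_ID, torch_dtype=torch.bfloat16, device_map="auto"
    )
    prompt = build_prompt(problem, steps) + "\nIs the last step correct?"
    ids = tok(prompt, return_tensors="pt").to(model.device)
    with torch.no_grad():
        logits = model(**ids).logits[0, -1]
    # Compare assumed label tokens, renormalizing over just the two.
    good = tok.encode(" correct", add_special_tokens=False)[0]
    bad = tok.encode(" incorrect", add_special_tokens=False)[0]
    return torch.softmax(logits[[good, bad]], dim=-1)[0].item()
```

Calling `score_last_step` on each successive prefix of a solution yields the per-step feedback described above, at the cost of one forward pass per step.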

Good For

  • Training LLMs: Ideal for researchers and developers looking to train or fine-tune LLMs with robust step-level feedback for improved reasoning.
  • Evaluating Reasoning: Can serve as a baseline or reference verifier when benchmarking PRMs on formal logic and proof tasks.
  • Formal Verification Tasks: Particularly strong in verifying steps related to formal logic and mathematical proofs.
  • Enhancing LLM Reliability: Useful for applications requiring high reliability in LLM-generated reasoning, such as scientific discovery or complex problem-solving systems.
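One common way to turn step-level scores into higher reliability is best-of-N reranking: score every step of each candidate solution, aggregate with `min` (a reasoning chain is only as strong as its weakest step), and keep the best candidate. The aggregation rule and the stub scorer below are illustrative choices, not the documented FoVer recipe:

```python
# Best-of-N reranking over step-level scores. The stub scorer stands in
# for a real PRM call; min-aggregation is one common (assumed) choice.

def rerank(candidates: list[list[str]], score_step) -> list[str]:
    """Return the candidate whose weakest step scores highest."""
    return max(candidates, key=lambda steps: min(map(score_step, steps)))

# Stub scorer for demonstration: pretend shorter steps are more reliable.
stub = lambda step: 1.0 / (1 + len(step))
best = rerank([["a long dubious step"], ["ok", "fine"]], stub)
print(best)  # → ['ok', 'fine']
```

Swapping the stub for actual PRM scores gives an inference-time refinement loop without retraining the generator.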