GuoxinChen/ReForm-8B
GuoxinChen/ReForm-8B is an 8 billion parameter language model developed by Guoxin Chen, designed for reflective autoformalization of mathematical statements. It utilizes a novel Prospective Bounded Sequence Optimization (PBSO) reinforcement learning algorithm to generate, verify, and refine formal mathematical statements. This model excels at autonomously detecting and correcting semantic errors, achieving significant improvements in semantic consistency across formalization benchmarks.
Loading preview...
ReForm-8B: Reflective Autoformalization for Mathematics
ReForm-8B, developed by Guoxin Chen, is an 8 billion parameter model focused on autoformalization, the process of translating natural language mathematical problems into formal mathematical statements. Unlike traditional single-pass translation, ReForm employs a unique reflective paradigm that iteratively generates, validates, and refines formal statements, allowing it to self-correct semantic errors.
Key Innovations:
- Reflective Autoformalization: Implements a "generate–validate–refine" cycle for iterative error detection and correction.
- Prospective Bounded Sequence Optimization (PBSO): A novel reinforcement learning algorithm that uses position-specific rewards to ensure stable and interpretable optimization for both task accuracy and critique quality.
Performance:
ReForm-8B demonstrates state-of-the-art semantic consistency, achieving an average of +22.6 percentage point improvement over strong baselines across four formalization benchmarks: miniF2F, ProofNet, Putnam, and AIME 2025. This indicates its superior ability to produce semantically accurate formalizations.
Ideal Use Cases:
- Formalizing mathematical proofs: Translating natural language mathematical problems into formal systems like Lean 4.
- Automated theorem proving: Generating accurate formal statements as input for proof assistants.
- Educational tools: Assisting students and researchers in understanding and formalizing complex mathematical concepts.