ReForm-8B: Reflective Autoformalization for Mathematics

ReForm-8B, developed by Guoxin Chen, is an 8 billion parameter model focused on autoformalization, the process of translating natural language mathematical problems into formal mathematical statements. Unlike traditional single-pass translation, ReForm employs a unique reflective paradigm that iteratively generates, validates, and refines formal statements, allowing it to self-correct semantic errors.

Key Innovations:

Reflective Autoformalization: Implements a "generate–validate–refine" cycle for iterative error detection and correction.
Prospective Bounded Sequence Optimization (PBSO): A novel reinforcement learning algorithm that uses position-specific rewards to ensure stable and interpretable optimization for both task accuracy and critique quality.

Performance:

ReForm-8B demonstrates state-of-the-art semantic consistency, achieving an average of +22.6 percentage point improvement over strong baselines across four formalization benchmarks: miniF2F, ProofNet, Putnam, and AIME 2025. This indicates its superior ability to produce semantically accurate formalizations.

Ideal Use Cases:

Formalizing mathematical proofs: Translating natural language mathematical problems into formal systems like Lean 4.
Automated theorem proving: Generating accurate formal statements as input for proof assistants.
Educational tools: Assisting students and researchers in understanding and formalizing complex mathematical concepts.