GuoxinChen/ReForm-8B is an 8-billion-parameter language model developed by Guoxin Chen for reflective autoformalization, the translation of natural-language mathematical statements into formal ones. It is trained with a novel reinforcement learning algorithm, Prospective Bounded Sequence Optimization (PBSO), to generate a formal statement, verify it, and refine it. The model is designed to detect and correct its own semantic errors, and it is reported to improve semantic consistency substantially across formalization benchmarks.
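The generate, verify, and refine cycle described above can be pictured as a small control loop. The sketch below is only an illustration of that general pattern, not the model's actual implementation and not the PBSO training algorithm. The names `reflective_formalize`, `Verdict`, `toy_generate`, and `toy_verify` are hypothetical, and the two toy functions are hard-coded stand-ins for calls to a model and a semantic checker.

```python
from dataclasses import dataclass
from typing import Callable, Optional, Tuple


@dataclass
class Verdict:
    """Outcome of a semantic check (hypothetical structure)."""
    consistent: bool      # does the formal statement match the informal one?
    critique: str = ""    # description of the mismatch, if any


def reflective_formalize(
    informal: str,
    generate: Callable[[str, Optional[str], Optional[str]], str],
    verify: Callable[[str, str], Verdict],
    max_rounds: int = 3,
) -> Tuple[str, bool, int]:
    """Generate a draft, check it, and retry with the critique.

    Returns (last_draft, accepted, rounds_used).
    """
    if max_rounds < 1:
        raise ValueError("max_rounds must be at least 1")
    draft: Optional[str] = None
    critique: Optional[str] = None
    for round_no in range(1, max_rounds + 1):
        # The generator sees the previous draft and its critique, if any.
        draft = generate(informal, draft, critique)
        verdict = verify(informal, draft)
        if verdict.consistent:
            return draft, True, round_no
        critique = verdict.critique
    # No draft passed the check within the round budget.
    return draft, False, max_rounds


# Toy stand-ins (assumptions for illustration only).
def toy_generate(informal, previous, critique):
    if previous is None:
        # Deliberately wrong first draft.
        return "theorem t (n : Nat) : n + 0 = 0"
    return "theorem t (n : Nat) : n + 0 = n"


def toy_verify(informal, formal):
    ok = formal.endswith("= n")
    return Verdict(ok, "" if ok else "right-hand side should be n")


result = reflective_formalize(
    "Adding zero to a natural number leaves it unchanged.",
    toy_generate,
    toy_verify,
)
```

In this toy run the first draft fails the check, the critique is passed back to the generator, and the second draft is accepted. With a real model, `generate` and `verify` would be replaced by model calls; how ReForm-8B interleaves these steps internally is not specified here.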