Overview
Goedel-Prover-V2-8B: State-of-the-Art Theorem Proving
Goedel-Prover-V2-8B is an 8 billion parameter model from the Goedel-Prover-V2 series, setting a new benchmark in open-source automated formal proof generation. Developed by Goedel-LM, this model incorporates several key innovations:
Key Capabilities & Innovations
- Scaffolded Data Synthesis: Progressively trains the model on synthetic proof tasks of increasing difficulty.
- Verifier-Guided Self-Correction: Leverages Lean's compiler feedback to iteratively revise and improve proofs, mimicking human refinement processes. The model can perform multiple rounds of self-correction, enhancing proof quality.
- Model Averaging: Combines multiple model checkpoints for improved robustness and performance.
- Exceptional Efficiency: The 8B model achieves 83.0% on the MiniF2F test set at Pass@32, matching the performance of the 100 times larger DeepSeek-Prover-V2-671B.
- New Benchmark: Introduced MathOlympiadBench, a dataset of 360 human-verified Olympiad-level mathematical competition problems, to foster further research.
Use Cases
- Formal Theorem Proving: Ideal for generating and verifying formal proofs in environments like Lean 4.
- Mathematical Reasoning: Excels in complex mathematical problem-solving, as demonstrated by its performance on MiniF2F and PutnamBench.
- Research & Development: Released to aid open-source projects, particularly those focused on mathematical competitions like the IMO.