introvoyz041/Goedel-Prover-V2-32B
Goedel-Prover-V2-32B by Goedel-LM is a 32 billion parameter language model specifically designed for automated formal proof generation with a 32K context length. It achieves state-of-the-art performance on formal theorem proving benchmarks like MiniF2F and PutnamBench, utilizing scaffolded data synthesis and verifier-guided self-correction. This model excels at generating and refining formal proofs, making it highly suitable for advanced mathematical reasoning and formal verification tasks.
Loading preview...
Goedel-Prover-V2-32B: State-of-the-Art Formal Theorem Prover
Goedel-Prover-V2-32B is a 32 billion parameter language model from Goedel-LM, engineered for advanced automated formal proof generation. It introduces key innovations including scaffolded data synthesis for progressive training on increasingly complex theorems, verifier-guided self-correction using Lean's compiler feedback, and model averaging for improved robustness.
Key Capabilities & Performance
- Achieves 88.0% on MiniF2F at Pass@32 in standard mode and 90.4% with self-correction, significantly outperforming prior state-of-the-art models like DeepSeek-Prover-V2-671B and Kimina-Prover-72B.
- Solves 64 problems on PutnamBench at Pass@64 in self-correction mode, securing the top rank and surpassing DeepSeek-Prover-V2-671B's record of 47 problems.
- Self-correction mode iteratively refines proofs using Lean compiler feedback, enhancing proof quality and efficiency.
- Introduces MathOlympiadBench, a new benchmark of 360 human-verified Olympiad-level mathematical problems, to foster further research.
Ideal Use Cases
- Automated formal verification in mathematics and computer science.
- Research and development in theorem proving and AI reasoning.
- Educational tools for understanding and generating formal proofs.
- Applications requiring highly accurate and verifiable mathematical solutions.