Google DeepMind’s AlphaProof reaches Olympiad level — and what it means for the future of mathematical proof.
AlphaProof’s performance at the 2024 International Mathematical Olympiad, Nature (2025)
Every summer, the most gifted young mathematicians in the world converge to compete in the International Mathematical Olympiad (IMO) — six problems, nine hours across two days, and a level of difficulty that humbles even exceptional minds. In 2024, an uninvited but remarkable competitor quietly entered the fray: Google DeepMind’s AlphaProof. It scored 28 out of a possible 42 points, placing it squarely in the silver-medal category and just one point short of gold.
What makes this achievement genuinely historic is not that a computer solved hard maths problems — machines have done arithmetic faster than humans for decades. What is new is that AlphaProof proved its answers with 100% verified correctness, step by logical step, using the formal proof assistant Lean. No guessing. No hallucinating a plausible-sounding answer. Every claim in every solution was machine-verified to be logically airtight.
How AlphaProof works
AlphaProof is a neuro-symbolic hybrid — it combines the intuitive pattern-matching of a large language model (based on Google’s Gemini architecture) with the rigorous verification engine of Lean, a formal mathematics software environment. The system was trained in three stages: first absorbing 300 billion tokens of mathematical text and code; then learning from 300,000 expert-written formal proofs; and finally, tackling 80 million formal mathematics problems through reinforcement learning — rewarded for every successful proof it constructed.
At the IMO, AlphaProof solved three non-geometry problems, including what IMO judges considered the hardest problem in the entire competition — a problem solved by only five human contestants. AlphaGeometry 2, a companion system, solved the geometry problem independently.
“AlphaProof is designed to prove mathematical statements — and it guarantees 100% correct solutions by verifying every logical step.” — Nature, 2025
Gold in 2025
The progression was swift. By 2025, an advanced version of Gemini equipped with “Deep Think” reasoning achieved the full gold-medal standard at IMO 2025 — solving all six problems within the time allowed. Unlike the 2024 effort, which required human experts to manually translate problems into formal language, the 2025 system could read and interpret problems in natural language directly. What previously demanded two to three days of computation was achieved within the competition’s standard timeframe.
This is not merely a parlour trick. Mathematicians across fields are beginning to ask: if AI can verify proofs and explore mathematical territories at machine speed, could it help crack problems that have resisted human effort for centuries — the Riemann Hypothesis, the Birch and Swinnerton-Dyer conjecture, P versus NP? AlphaProof does not answer these questions, but it brings them closer to the horizon than at any point in history.
Sources & Further Reading
Hubert, T. et al. (2025). Olympiad-level formal mathematical reasoning with reinforcement learning. Nature. DOI: 10.1038/s41586-025-09833-y
Google DeepMind (2025). AI achieves silver-medal standard solving International Mathematical Olympiad problems. deepmind.google
Google DeepMind (2026). Advanced version of Gemini with Deep Think officially achieves gold-medal standard at the IMO. deepmind.google
Phys.org (2025). AI math genius delivers 100% accurate results. phys.org/news/2025-11-ai-math-genius-accurate-results.html
