Google AlphaProof Nexus Solves 9 Erdős Problems

Google DeepMind’s AlphaProof Nexus autonomously solved nine open Erdős problems plus 44 open conjectures from the On-Line Encyclopedia of Integer Sequences (OEIS).
The result arrived just a day after OpenAI claimed to have disproved a single 80-year-old Erdős conjecture.
Each problem cost a few hundred dollars to solve; two of the Erdős problems had been unsolved for 56 years.
AlphaProof Nexus pairs an LLM with the Lean proof assistant to generate machine-verified proofs.

What Happened

Google DeepMind’s AlphaProof Nexus, an AI system that generates machine-verified mathematical proofs, autonomously solved nine open Erdős problems just a day after OpenAI claimed its own single Erdős breakthrough, The Rundown reported. The system also proved 44 open conjectures from the On-Line Encyclopedia of Integer Sequences (OEIS). Two of the nine Erdős problems had been unsolved for 56 years.

Why It Matters

The 9-to-1 margin reframes the recent narrative about AI’s progress on research-level mathematics. OpenAI’s announcement last week, that its system had disproved an 80-year-old Erdős conjecture, was the headline-grabbing claim of the month. Google DeepMind’s quieter result is significantly larger: nine problems in a single batch against OpenAI’s one, at a cost of only a few hundred dollars per problem.

Erdős problems are some of the hardest unsolved questions in mathematics: problems in combinatorics, graph theory, and number theory that the mathematician Paul Erdős personally posed. They have served as a benchmark for original mathematical reasoning since Erdős’s death in 1996. An AI system resolving previously unsolved Erdős problems is a meaningful data point on capability.

Technical Details

AlphaProof Nexus pairs an LLM with Lean, a proof assistant in which proofs are written as code and checked mechanically by a small trusted kernel, so every proof the system produces can be verified by machine rather than taken on trust. The system handled problems spanning combinatorics and graph theory. Each problem cost a few hundred dollars in compute to solve, per Google DeepMind. A simpler version of the agent matched the results but cost more; problems requiring genuinely new mathematical constructions remained out of reach. Of the nine Erdős problems solved, two had been unsolved for 56 years.
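For readers unfamiliar with Lean, the snippet below is a minimal illustration of what "machine-verified" means. It is a toy example in Lean 4 using only the core library, not output from AlphaProof Nexus, and it is far simpler than any Erdős problem. The theorem name `add_comm_example` is chosen here purely for illustration.

```lean
-- Toy example only; not taken from AlphaProof Nexus.
-- The text after the colon is the claim; the term after `:=` is the proof.
-- Lean accepts the file only if the proof really establishes the claim.
theorem add_comm_example (a b : Nat) : a + b = b + a :=
  Nat.add_comm a b

-- A concrete arithmetic fact, checked by evaluating both sides.
example : 2 + 2 = 4 := rfl

-- By contrast, replacing 4 with 5 above would make Lean reject the file,
-- which is the sense in which a Lean proof is mechanically checked.
```

The point of the design is that the checker does not need to trust whoever, or whatever, wrote the proof: a proof produced by an LLM passes or fails the same check as one written by a person.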

The 44 additional open OEIS conjectures were a separate demonstration of scale: autonomous proof generation across a broad slice of the published open-conjecture corpus. The result builds on Google’s broader formal-verification work with Lean and on the earlier AlphaProof system, which, together with AlphaGeometry 2, reached silver-medal-level performance on the 2024 International Mathematical Olympiad (IMO) problems.

Who’s Affected

Research mathematicians gain a tool that can help resolve open problems, including some that have stood for decades. OpenAI faces a comparative result that puts its earlier single-problem claim in context. The Lean ecosystem (which originated at Microsoft Research and is now sustained by the broader Lean community) sees significant validation of the formal-verification approach. Other AI labs working on research-level math, including Mira Murati’s Thinking Machines Lab and Safe Superintelligence, face a higher bar. OEIS contributors gain a system that can attack their open-conjecture archive at scale.

What’s Next

Google DeepMind has not announced specific public-access plans for AlphaProof Nexus. The proofs themselves are machine-verifiable in Lean; expect them to be reviewed and incorporated into the broader mathematical literature through standard publication channels. Industry watchers should track parallel announcements from OpenAI, Mira Murati’s Thinking Machines Lab, and the Lean community.
