Models & Releases · Research · The Decoder · May 25

Google DeepMind's AlphaProof Nexus solves decades-old math problems for a few hundred dollars

Google DeepMind's AlphaProof Nexus represents a shift in how AI tackles formal mathematics: rather than generating natural-language proofs, the system uses the Lean proof assistant to verify each step automatically, removing the ambiguity of informal arguments. The system solved nine open Erdős problems, including two that had been unsolved for 56 years, at a cost of hundreds of dollars per problem. While the 2.5 percent success rate indicates early-stage capability, the architectural choice to ground reasoning in formal verification rather than language generation points to a broader trend toward AI systems that operate within constrained, verifiable domains. This matters for downstream applications in code generation, theorem proving, and safety-critical domains, where proof-checking could become a competitive moat.
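For readers unfamiliar with Lean, the sketch below shows what a machine-checked proof looks like. It is a toy illustration only: the statements are elementary facts about natural numbers, unrelated to the Erdős problems, and the theorem name `add_comm_example` is made up for this example. Nothing here is taken from AlphaProof Nexus itself.

```lean
-- Toy example (not from AlphaProof Nexus): addition on natural numbers commutes.
-- Lean accepts the theorem only if the supplied proof term type-checks
-- against the stated proposition.
theorem add_comm_example (a b : Nat) : a + b = b + a :=
  Nat.add_comm a b

-- A concrete arithmetic fact, proved by computation: both sides
-- evaluate to the same numeral, so reflexivity (`rfl`) suffices.
example : 2 + 2 = 4 := rfl
```

If either proof were wrong, Lean would reject the file rather than accept a plausible-sounding argument, which is the auditability property the summary describes.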

Modelwire context

Analyst take

The cost figure is the buried lede: at a few hundred dollars per solved open problem, the economic barrier to automated theorem proving has dropped far enough that academic and industrial research teams can treat it as a line item rather than a moonshot budget. That changes who can afford to use this, not just who built it.

This is largely disconnected from recent activity in our archive, as we have no prior coverage to anchor it to. It belongs to a cluster of stories about AI systems that trade generality for verifiability, a design philosophy gaining traction across code synthesis and formal methods. The choice to build on Lean rather than produce natural-language proofs is a deliberate constraint that makes outputs auditable, which matters most in safety-critical and regulated domains. OpenAI has been circling adjacent territory with its reasoning models, and the competitive pressure to own the formal verification layer is real, even if neither company has framed it that way publicly.

Watch whether DeepMind publishes a reproducibility package or benchmark suite that lets outside teams validate the 2.5 percent success rate on a standardized Erdős problem set. If that drops within six months and the rate holds, the cost-per-proof figure becomes a credible procurement benchmark; if it doesn't ship, the headline number stays a one-time demonstration.

This analysis is generated by Modelwire’s editorial layer from our archive and the summary above. It is not a substitute for the original reporting.

Mentions: Google DeepMind · AlphaProof Nexus · Lean · Erdős problems · OpenAI

Read full story at The Decoder → (the-decoder.com)

Modelwire Editorial

This synthesis and analysis was prepared by the Modelwire editorial team. We use advanced language models to read, ground, and connect the day’s most significant AI developments, providing original strategic context that helps practitioners and leaders stay ahead of the frontier.


Modelwire summarizes, we don’t republish. The full content lives on the-decoder.com. If you’re a publisher and want a different summarization policy for your work, see our takedown page.