Researchers at Google DeepMind have developed an artificial intelligence system named AlphaEvolve that discovers new, complex mathematical structures. This AI agent has advanced the understanding of complexity theory, a fundamental area of theoretical computer science, by improving long-standing results on problems that have challenged researchers for years.
The system operates by iteratively evolving computer code to find and verify combinatorial structures. These discoveries have led to new, proven theorems that push the boundaries of what is known about the limits of computation and approximation algorithms.
Key Takeaways
- Google's AlphaEvolve is an AI agent that uses large language models to evolve code and discover mathematical structures.
- The AI improved the state-of-the-art limit for approximating a complex optimization problem known as MAX-4-CUT.
- It also discovered new types of graphs, called Ramanujan graphs, that helped tighten bounds on the average-case hardness of other problems.
- AlphaEvolve's approach is unique because it finds verifiable proof components, not entire proofs, ensuring mathematical correctness.
Introducing AlphaEvolve
Google DeepMind's AlphaEvolve is a system designed to tackle complex mathematical and computer science challenges. Unlike AI models that generate text or images, AlphaEvolve is a coding agent that refines populations of code snippets through an evolutionary process.
The system starts with a set of code snippets and evaluates the mathematical structures they produce. Using a feedback loop, a large language model (LLM) then modifies the most successful snippets to generate even better solutions. This iterative process allows the AI to explore vast and complex search spaces far beyond human capacity.
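The loop described above can be sketched in miniature. The toy below is an illustrative evolutionary search, not AlphaEvolve itself: where AlphaEvolve uses an LLM to rewrite code snippets, this sketch uses a simple random mutation, and the fitness function and bit-string task are invented for demonstration.

```python
import random

def evolve(initial_population, score, mutate, generations=100, keep=5):
    """Toy evolutionary loop: score all candidates, keep the best few,
    and mutate survivors to fill the next generation. In AlphaEvolve the
    mutation step is performed by an LLM rewriting code; here it is a
    simple random perturbation."""
    population = list(initial_population)
    for _ in range(generations):
        population.sort(key=score, reverse=True)
        survivors = population[:keep]
        population = survivors + [
            mutate(random.choice(survivors))
            for _ in range(len(initial_population) - keep)
        ]
    return max(population, key=score)

# Invented toy task: evolve a bit-string with as many 1s as possible.
def score(bits):
    return sum(bits)

def mutate(bits):
    flipped = list(bits)
    i = random.randrange(len(flipped))
    flipped[i] ^= 1  # flip one random bit
    return flipped

random.seed(0)
best = evolve([[0] * 12 for _ in range(20)], score, mutate, generations=200)
print(score(best))  # typically reaches the maximum of 12
```

Because the best candidates always survive into the next generation, the best score never decreases, which is the property that lets such a search steadily improve on hard combinatorial objectives.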
What is Complexity Theory?
Complexity theory is a branch of theoretical computer science that classifies computational problems according to their inherent difficulty. It helps researchers understand which problems can be solved efficiently by computers and which are likely to remain intractable, even with immense processing power.
AlphaEvolve's approach has yielded significant results in two distinct areas of complexity theory, demonstrating a new way for AI to act as a collaborator in fundamental research.
A New Approach to Universal Proofs
A central challenge in using AI for theoretical research is moving from specific solutions to universal theorems. An AI might solve a problem for one specific case, but computer scientists need proofs that hold true for all possible instances.
AlphaEvolve overcomes this by using a technique called "lifting." In this method, researchers use established mathematical proof frameworks that depend on specific, highly optimized finite structures. The AI's job is not to create the entire proof but to find a better version of this small, self-contained structure.
"If a better structure can be found, the entire proof framework 'lifts' this improvement to a better universal result," the researchers explained. "The advantage of this approach is that to certify overall correctness, one needs to only certify the correctness of the finite structure that has been evolved."
This method allows the AI to focus on a manageable but critical part of the problem. Once a superior structure is discovered and verified, it can be plugged into the existing framework to immediately produce a new, more powerful universal theorem.
Improving a Classic Optimization Problem
One of the key achievements was applied to the MAX-k-CUT problem. In this problem, the goal is to divide the nodes of a network into 'k' separate groups to maximize the number of connections that cross between different groups. This is a classic NP-hard problem, meaning no efficient algorithm is known to find the perfect solution every time.
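To make the problem concrete, here is a minimal exhaustive solver for MAX-k-CUT on a tiny graph. The k**n runtime of this brute-force approach is exactly why exact solutions are infeasible at scale and why the field focuses on approximation algorithms; the example graph is invented for illustration.

```python
from itertools import product

def max_k_cut(n, edges, k):
    """Exhaustively find the maximum k-cut of a graph on n nodes:
    try every assignment of nodes to k groups and count the edges
    whose endpoints land in different groups. Runtime grows as k**n."""
    best = 0
    for assignment in product(range(k), repeat=n):
        cut = sum(1 for u, v in edges if assignment[u] != assignment[v])
        best = max(best, cut)
    return best

# A 4-cycle 0-1-2-3-0: with k=2 the alternating 2-coloring
# puts every edge across the cut.
edges = [(0, 1), (1, 2), (2, 3), (3, 0)]
print(max_k_cut(4, edges, 2))  # 4
```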
Researchers instead focus on approximation algorithms, which find solutions guaranteed to be within a certain percentage of the optimal one. A major question is determining the absolute limit of this approximation, known as the "inapproximability" bound.
For the MAX-4-CUT problem (dividing a network into four groups), the previous best-known result established that it is NP-hard to find a solution that is better than 98.83% of the optimal one.
AlphaEvolve was tasked with finding a better "gadget," a finite structure used to prove this hardness. The AI discovered a highly intricate gadget involving 19 variables with a complex weighting system. This new structure established a tighter inapproximability bound of 98.7%. Lowering the threshold strengthens the result: it proves that approximating the problem is NP-hard across a wider range of guarantees, ruling out a larger class of efficient algorithms.
While the numerical change seems small, such improvements in a mature field like hardness of approximation are considered significant breakthroughs that often require entirely new insights.
Hardness of Problems on Average
The team also used AlphaEvolve to study the average-case hardness of problems, which examines how difficult a problem is for typical instances rather than just the worst-possible scenarios. They focused on certifying properties of sparse random graphs, which are networks that mimic the structure of many real-world systems.
This research is connected to the existence of special graphs called Ramanujan graphs. Previous work used computer assistance to find such graphs with up to 10 nodes. Finding larger and more extreme examples is an exceptionally difficult search problem.
AlphaEvolve successfully navigated this search, discovering new Ramanujan graphs with the desired properties on as many as 163 nodes. These discoveries significantly improved the known lower bounds for average-case hardness. Combined with other non-AI algorithmic progress, the results have nearly settled the computational hardness of these questions, matching the upper and lower bounds to the third decimal place.
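The defining property of a Ramanujan graph is spectral: a connected d-regular graph is Ramanujan if every adjacency eigenvalue other than the trivial eigenvalue d has absolute value at most 2*sqrt(d-1). The check below is an illustrative sketch of that definition (for non-bipartite graphs; bipartite Ramanujan graphs additionally allow the eigenvalue -d), using the 10-node Petersen graph as a classic example. It is not the construction or verification code used in the research.

```python
import numpy as np

def is_ramanujan(adj):
    """Check the Ramanujan property for a connected d-regular,
    non-bipartite graph: every eigenvalue other than d must have
    absolute value at most 2*sqrt(d-1)."""
    adj = np.asarray(adj, dtype=float)
    d = int(adj[0].sum())                      # regular degree
    eigs = np.linalg.eigvalsh(adj)             # ascending order
    nontrivial = eigs[:-1]                     # drop the trivial eigenvalue d
    return bool(np.all(np.abs(nontrivial) <= 2 * np.sqrt(d - 1) + 1e-9))

# Petersen graph: 3-regular on 10 nodes, a well-known Ramanujan graph.
n = 10
adj = np.zeros((n, n), dtype=int)
for i in range(5):
    j = (i + 1) % 5
    adj[i, j] = adj[j, i] = 1                  # outer 5-cycle
    adj[i, i + 5] = adj[i + 5, i] = 1          # spokes
    a, b = 5 + i, 5 + (i + 2) % 5              # inner pentagram
    adj[a, b] = adj[b, a] = 1
print(is_ramanujan(adj))  # True
```

The Petersen graph's nontrivial eigenvalues are 1 and -2, both within the bound 2*sqrt(2) ≈ 2.83, so the check passes.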
Verification Remains Key
A critical aspect of this research is the absolute requirement for correctness in mathematics. Unlike AI applications where some imprecision or subjectivity is tolerable, a mathematical proof must be perfect.
When an LLM is asked to generate a proof directly, it can produce errors or "hallucinations." The AlphaEvolve method avoids this by having the AI discover a structure, which is then verified by a separate, rigorous computer program. The final theorem's validity rests on the proven mathematical framework and the computationally verified structure.
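The division of labor can be illustrated with a toy analogue: the search process proposes a structure and a claimed value, and a separate, exhaustive verifier confirms the claim without trusting the search at all. The example below applies this pattern to the small MAX-k-CUT setting and is an invented illustration, not the verification program used in the research.

```python
from itertools import product

def verify_claim(n, edges, k, assignment, claimed_value):
    """Independent brute-force verifier: recount the cut of the
    proposed assignment, then confirm no other assignment beats it.
    The search that produced `assignment` is never trusted."""
    def cut(a):
        return sum(1 for u, v in edges if a[u] != a[v])
    if cut(assignment) != claimed_value:
        return False  # the claimed value itself is wrong
    return all(cut(a) <= claimed_value for a in product(range(k), repeat=n))

# Suppose a search proposed the alternating 2-coloring of a 4-cycle
# with a claimed maximum cut of 4; the verifier checks it from scratch.
edges = [(0, 1), (1, 2), (2, 3), (3, 0)]
print(verify_claim(4, edges, 2, (0, 1, 0, 1), 4))  # True
```

As with AlphaEvolve's discoveries, only the verifier's output matters for correctness: however the candidate was found, the theorem rests on the independent check.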
Interestingly, AlphaEvolve also contributed to the verification process itself. It developed sophisticated optimization strategies that resulted in a 10,000x speedup in verification time. This speedup was crucial for allowing the system to explore more complex structures. However, to ensure absolute certainty, the final discoveries were still confirmed using the original, slower brute-force algorithms.
According to the researchers, these findings suggest AI is becoming a valuable collaborator in mathematical discovery. As AI-generated proofs become more common, the task of verification will become an even more critical bottleneck in scientific progress.