February 10, 2025
3 min read
Google’s AI can beat the smartest high-school students at math
Google’s AlphaGeometry2 AI reaches the level of gold-medal students at the International Mathematical Olympiad

Problems like those the Google DeepMind AI AlphaGeometry2 aced are set at the International Mathematical Olympiad.
Wirestock, Inc./Alamy Stock Photo
A year ago, AlphaGeometry, an artificial-intelligence (AI) problem solver created by Google DeepMind, surprised the world by performing at the level of a silver medalist at the International Mathematical Olympiad (IMO), a competition for high-school students.
The DeepMind team says that the upgraded system, AlphaGeometry2, now performs better than the average gold medalist. The results are described in an arXiv preprint.
“I don’t think it will be long before a computer gets full marks at the IMO,” says Kevin Buzzard, a mathematician at Imperial College London.
Euclidean geometry is one of four topics covered by IMO problems. Geometry demands particular skills of an AI, because competitors must provide rigorous proofs of statements about geometric objects in the plane. AlphaGeometry2 made its public debut in July, alongside AlphaProof, a newly announced DeepMind system built to solve the non-geometric questions in the IMO problem set.
Mathematical language
AlphaGeometry combines a specialized language model with a “neurosymbolic” system: a component that does not learn from data the way neural networks do, but instead applies abstract reasoning encoded by humans. The team trained the language model to speak a formal mathematical language, which allows the system’s output to be checked automatically for logical rigor.
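AlphaGeometry2’s formal language is its own domain-specific one, not shown in this article. But as a rough illustration of what “machine-checkable mathematics” means, here is a toy theorem written in the Lean proof language, whose checker accepts a proof only if every logical step is rigorous (the theorem and its name are hypothetical examples, not anything from the DeepMind work):

```lean
import Mathlib

-- Illustrative toy example only: AlphaGeometry2 uses its own
-- domain-specific geometry language, not Lean. This theorem states,
-- in formal language, that the midpoint (a + b) / 2 of a segment on
-- the real line is equidistant from both endpoints. Lean's checker
-- verifies the `ring` proof step mechanically; a flawed proof would
-- simply be rejected.
theorem midpoint_equidistant (a b : ℝ) :
    (a + b) / 2 - a = b - (a + b) / 2 := by
  ring
```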
For AlphaGeometry2, the team made several improvements, including integrating Gemini, Google’s cutting-edge family of large language models. The team also gave the system the ability to reason by moving geometric objects around the plane, such as moving a point along a line to change a triangle’s height, and by solving linear equations.
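As a toy illustration of why this kind of move reduces to linear algebra (this worked example is ours, not one from the paper): a triangle’s area depends linearly on its height, so sliding the apex to achieve a target area means solving a linear equation.

```latex
% Toy illustration, not taken from the DeepMind paper: with base AB
% fixed, the area A of a triangle is linear in its height h, so the
% height needed to hit a target area comes from one linear equation.
\[
  A \;=\; \tfrac{1}{2}\,\lvert AB\rvert\, h
  \qquad\Longrightarrow\qquad
  h \;=\; \frac{2A}{\lvert AB\rvert}.
\]
```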
The system was able to solve 84% of all the geometry problems set at IMOs over the past 25 years, up from 54% for the original AlphaGeometry. (Teams in India and China used a variety of approaches last year to achieve gold-medal-level performance on geometry, but on a smaller subset of IMO geometry problems.)
The authors of the DeepMind paper write that future improvements to AlphaGeometry could include handling mathematical problems that involve inequalities and nonlinear equations.
Rapid progress
The first AI system to achieve gold-medal-level scores on the overall test could win a US$5-million award known as the AI Mathematical Olympiad Prize. However, that competition requires the system to be open source.
Buzzard says he is not surprised by the rapid progress made by both DeepMind and the Indian and Chinese teams. But he notes that although IMO problems are hard, their subject matter is still conceptually simple, and there are many challenges AI will need to overcome before it can solve problems at the level of research mathematics.
AI researchers are eagerly awaiting the next iteration of the IMO, to be held on Australia’s Sunshine Coast in July. Once the problems are published for human participants to solve, AI-based systems will be expected to solve them too. (AI agents are not allowed to compete and are not eligible to win medals.) The new problems will be considered a reliable test, because there is no risk that their solutions already exist online, skewing the results, or have “leaked” into training data sets, a common problem with machine-learning-based systems.
This article is reproduced with permission and was first published on February 7, 2025.