Google’s AI R&D lab DeepMind says it has developed a new AI system to tackle problems with “machine-gradable” solutions.
In experiments, the system, called AlphaEvolve, helped optimize some of the infrastructure Google uses to train its AI models, DeepMind said. The company says it is building a user interface for interacting with AlphaEvolve and plans to launch an early access program for selected academics ahead of a broader rollout.
Most AI models hallucinate. Owing to their probabilistic architectures, they sometimes confidently make things up. In fact, newer AI models such as OpenAI’s o3 hallucinate more than their predecessors, illustrating how challenging the problem remains.
AlphaEvolve introduces a clever mechanism for reducing hallucinations: an automated evaluation system. The system uses models to generate and critique a pool of candidate answers to a question, then automatically evaluates and scores those answers for accuracy.
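DeepMind has not published AlphaEvolve’s internals in full, but the generate-then-score pattern it describes can be sketched in a few lines. In this minimal, illustrative sketch, a random numeric perturbation stands in for an LLM proposing candidate edits, and a toy objective stands in for the user-supplied evaluator; all names here are hypothetical.

```python
import random

random.seed(0)  # for reproducibility of this toy run

def evaluate(candidate):
    # Toy automated evaluator: higher is better. In AlphaEvolve's
    # setup, this role is played by a user-supplied scoring mechanism.
    target = 42
    return -abs(candidate - target)

def evolve(initial_pool, generations=50, pool_size=8):
    """Generate-critique-score loop: score every candidate, keep the
    best half, derive new candidates from the survivors, repeat."""
    pool = list(initial_pool)
    for _ in range(generations):
        # Rank the pool with the automated evaluator.
        scored = sorted(pool, key=evaluate, reverse=True)
        survivors = scored[: pool_size // 2]
        # Derive new candidates from survivors (an LLM would propose
        # edits here; a random perturbation stands in for it).
        children = [c + random.choice([-3, -1, 1, 3]) for c in survivors]
        pool = survivors + children
    return max(pool, key=evaluate)

best = evolve([0, 100], generations=200)
```

Keeping the survivors in the pool (elitism) guarantees the best score never regresses between generations, which is what lets the automated evaluator steadily filter out wrong answers.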

AlphaEvolve is not the first system to take this tack. Researchers, including a DeepMind team several years ago, have applied similar techniques in various mathematical domains. What sets AlphaEvolve apart, DeepMind claims, is its use of “state-of-the-art” models, specifically Gemini models.
To use AlphaEvolve, users prompt the system with a problem, optionally including details such as instructions, equations, code snippets, and relevant literature. They must also supply a mechanism for automatically evaluating the system’s answers, in the form of a formula.
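That evaluation mechanism amounts to a scoring function the user writes themselves. As a hedged illustration, assuming a toy objective of filling a square container’s area without overflowing it (the function name and problem are invented for this example, and real geometric feasibility is ignored):

```python
def score_candidate(side_lengths, container=10.0):
    """Hypothetical user-supplied evaluator: higher is better.
    Scores a proposed set of square side lengths by how much of the
    container's area they would use without exceeding it."""
    used = sum(s * s for s in side_lengths)
    capacity = container * container
    if used > capacity:
        return float("-inf")  # invalid candidates score worst
    return used / capacity    # fraction of area covered, in [0, 1]

print(score_candidate([6.0, 8.0]))  # 36 + 64 exactly fills a 10x10 area -> 1.0
```

Because the formula runs without human intervention, it is what lets the system grade thousands of candidate answers on its own, which is exactly the property that restricts AlphaEvolve to machine-checkable problems.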
Because AlphaEvolve can only tackle problems whose solutions can be evaluated automatically, the system works only on certain kinds of problems, chiefly in fields such as computer science and systems optimization. Another major limitation is that AlphaEvolve can only describe solutions as algorithms, making it a poor fit for problems without numerical answers.
To benchmark AlphaEvolve, DeepMind tested the system on a curated set of roughly 50 mathematical problems spanning geometry to combinatorics. AlphaEvolve managed to “rediscover” the best-known answers to the problems, and uncovered improved solutions in 20% of cases.
DeepMind also evaluated AlphaEvolve on practical problems, such as improving the efficiency of Google’s data centers and speeding up model training runs. According to the lab, AlphaEvolve devised an algorithm that continuously recovers, on average, 0.7% of Google’s worldwide compute resources. The system also proposed an optimization that reduced the overall time it takes Google to train a Gemini model by 1%.
To be clear, AlphaEvolve has not made any groundbreaking discoveries. In one experiment, the best the system could do was find improvements to Google’s TPU AI accelerator chip design that other tools had already flagged.
However, DeepMind makes the same case for AlphaEvolve that many AI labs make for their systems: it is a time-saver that frees experts to focus on more important work.