This week, Sakana Ai, a Nvidia-backed startup that raised hundreds of millions of dollars from VC companies, made a surprising claim. The company said it has created AI CUDA Engineer, an AI system that can effectively speed up training of certain AI models by up to 100 times.
The only problem is that the system didn’t work.
X users quickly discovered that Sakana’s system actually brought about average model training performance. According to one user, Sakana’s AI has slowed down three times as much as possible, rather than speeding up.
What was wrong? A code bug, according to a post from Lucas Beyer, a member of Openai’s technical staff.
“Their original code is (a) wrong in a subtle way,” Beyer wrote to X:
After his posthumous death, which was published Friday, Sakana admitted that the system had found a way to “cheat” (as Sakana explained) and denounced the system’s tendency to “reward hacking.” up model training). A similar phenomenon has been observed in AI trained to play a game of chess.
Sakana said the system, with an evaluation code that the company was using, could bypass the verification for accuracy, among other checks. Sakana says it addresses the issue and intends to correct that claim in updated material.
“Since then, we’ve made the harness of evaluation and runtime profiling more robust and eliminated many of such (sic) loopholes,” the company wrote in X Post. “We are in the process of revising our papers and results and discussing them to reflect their effectiveness (…) we deeply apologise for our readers for oversight. We will provide immediate revisions to this work and provide learning. We’ll discuss about it.”
Props to Sakana leading to mistakes. But this episode is a reminder that it is probably the case, especially in AI, if you don’t think the claim is too good.