Analysis by Epoch AI, a non-profit AI research institute, suggests that the AI industry may not be able to wring large-scale performance gains out of reasoning AI models for much longer. According to the report's findings, progress from reasoning models could slow within a year.
Reasoning models such as OpenAI's o3 have driven significant gains on AI benchmarks over the last few months, particularly benchmarks that measure math and programming skills. These models can apply more computing to a problem, which can improve performance; the downside is that they take longer than conventional models to complete a task.
Reasoning models are developed by first training a conventional model on a huge amount of data and then applying a technique called reinforcement learning.
So far, frontier AI labs like OpenAI have not applied vast amounts of computing power to the reinforcement learning phase of reasoning model training, according to Epoch.
That's changing. OpenAI has said it applied roughly 10 times more computing to train o3 than its predecessor, o1, and Epoch speculates that most of this computing was devoted to reinforcement learning. And Dan Roberts, a researcher at OpenAI, recently revealed that the company's future plans call for prioritizing reinforcement learning to the point that it uses far more computing power than initial model training.
However, according to Epoch, there is still an upper limit to how much computing can be applied to reinforcement learning.

Josh You, the Epoch analyst who authored the analysis, explains that performance gains from standard AI model training are currently quadrupling every year, while performance gains from reinforcement learning are growing tenfold every 3-5 months. Progress from reasoning training, he continues, "will converge with the overall frontier by 2026."
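To see why those two growth rates imply convergence, here is a rough back-of-the-envelope sketch in Python. The starting compute values are hypothetical placeholders, not figures from Epoch's report; they are only meant to show how a 10x-every-3-5-months curve catches up with a 4x-per-year frontier within roughly a year.

```python
# Illustrative arithmetic for the growth rates cited in Epoch's analysis.
# The absolute starting values below are assumptions chosen for illustration,
# not numbers reported by Epoch.

frontier_compute = 1.0   # relative compute behind standard frontier training (assumed baseline)
rl_compute = 0.01        # assumed: the RL phase currently uses ~1% of frontier compute

frontier_growth_per_year = 4.0   # standard training compute: ~4x per year (per Epoch)
rl_growth_per_period = 10.0      # RL compute: ~10x every 3-5 months (per Epoch)
months_per_period = 4            # midpoint of the 3-5 month range

month = 0
while rl_compute < frontier_compute:
    month += 1
    frontier_compute *= frontier_growth_per_year ** (1 / 12)
    rl_compute *= rl_growth_per_period ** (1 / months_per_period)

print(f"RL compute catches up with the frontier after ~{month} months")
```

With these assumed starting points the gap closes in under a year; once it closes, reinforcement learning compute can only grow as fast as the overall frontier, so the outsized gains from scaling reasoning training taper off.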
Epoch's analysis makes a number of assumptions and draws in part on public comments from AI company executives. But it also argues that scaling reasoning models may prove challenging for reasons besides computing, including high overhead costs for research.
"If there's a persistent overhead cost required for research, reasoning models might not scale as far as expected," You writes. "Rapid compute scaling is potentially a very important ingredient in reasoning model progress, so it's worth tracking closely."
Any indication that reasoning models may hit some kind of limit in the near future is likely to worry the AI industry, which has invested enormous resources in developing these types of models. Already, studies have shown that reasoning models, which are very expensive to run, have serious flaws, such as a tendency to hallucinate more than certain conventional models.