AI

OpenAI’s new reasoning AI models hallucinate even more

By support | April 18, 2025


OpenAI’s recently launched o3 and o4-mini AI models are cutting edge in many respects. However, the new models still hallucinate, or make things up. In fact, they hallucinate more than several of OpenAI’s older models.

Hallucinations have proven to be one of the biggest and most difficult problems to solve in AI, affecting even today’s best-performing systems. Historically, each new model has improved slightly on this front, hallucinating less than its predecessor. That doesn’t seem to be the case for o3 and o4-mini, however.

According to OpenAI’s internal tests, o3 and o4-mini, which are so-called reasoning models, hallucinate more frequently than the company’s previous reasoning models (o1, o1-mini, and o3-mini), as well as OpenAI’s traditional “non-reasoning” models, such as GPT-4o.

Perhaps more concerning, the ChatGPT maker doesn’t really know why it’s happening.

In its technical report for o3 and o4-mini, OpenAI writes that “more research is needed” to understand why hallucinations are getting worse as it scales up its reasoning models. o3 and o4-mini perform better in some areas, including coding and math-related tasks. But because they “make more claims overall,” they often end up making “more accurate claims as well as more inaccurate/hallucinated claims,” according to the report.

OpenAI found that o3 hallucinated in response to 33% of questions on PersonQA, the company’s in-house benchmark for measuring the accuracy of a model’s knowledge about people. That is roughly twice the hallucination rate of OpenAI’s previous reasoning models, o1 and o3-mini, which scored 16% and 14.8%, respectively. o4-mini did even worse on PersonQA.

Third-party testing by Transluce, a nonprofit AI research lab, also found evidence that o3 tends to make up actions it took in the process of arriving at an answer. In one example, Transluce observed o3 claiming that it had run code “outside of ChatGPT” on a 2021 MacBook Pro and then copied the numbers into its answer. o3 has access to some tools, but it cannot do that.

“Our hypothesis is that the kind of reinforcement learning used for o-series models may amplify issues that are usually mitigated (but not fully erased) by standard post-training pipelines,” said Neil Chowdhury, a researcher at Transluce and former OpenAI employee, in an email to TechCrunch.

Transluce co-founder Sarah Schwettmann added that o3’s hallucination rate may make it less useful than it otherwise would be.

Kian Katanforoosh, a Stanford adjunct professor and CEO of the upskilling startup Workera, told TechCrunch that his team is already testing o3 in its coding workflows and has found it to be a step above the competition. However, Katanforoosh says that o3 tends to hallucinate broken website links: the model will supply a link that doesn’t work when clicked.

Hallucinations may help models arrive at interesting ideas and be creative in their “thinking,” but they also make some models a tough sell for businesses in markets where accuracy is paramount. A law firm, for example, likely wouldn’t be happy with a model that inserts lots of factual errors into client contracts.

One promising approach to boosting model accuracy is giving models web search capabilities. OpenAI’s GPT-4o with web search achieves 90% accuracy on SimpleQA. Potentially, search could improve reasoning models’ hallucination rates as well, at least in cases where users are willing to expose their prompts to a third-party search provider.
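To illustrate the general idea, here is a minimal sketch in Python of a search-grounded answering loop. The search_web and generate_answer helpers are hypothetical placeholders, not OpenAI APIs; the point is only that the model is asked to answer from retrieved sources rather than from its memory alone.

from typing import List


def search_web(query: str, top_k: int = 3) -> List[str]:
    """Hypothetical search helper: return the top_k text snippets for a query."""
    raise NotImplementedError("Wire this to a search provider of your choice.")


def generate_answer(prompt: str) -> str:
    """Hypothetical model call: return the model's completion for a prompt."""
    raise NotImplementedError("Wire this to the language model of your choice.")


def grounded_answer(question: str) -> str:
    # Retrieve snippets first, then ask the model to answer only from them,
    # which tends to reduce made-up facts compared with answering from memory.
    snippets = search_web(question)
    sources = "\n".join(f"- {s}" for s in snippets)
    prompt = (
        "Answer the question using only the sources below. "
        "If the sources do not contain the answer, say you don't know.\n\n"
        f"Sources:\n{sources}\n\n"
        f"Question: {question}\nAnswer:"
    )
    return generate_answer(prompt)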

If scaling up reasoning models really does continue to worsen hallucinations, the hunt for a solution will only become more urgent.

“Addressing hallucinations across all our models is an ongoing area of research, and we’re continually working to improve their accuracy and reliability,” said OpenAI spokesperson Niko Felix in an email to TechCrunch.

Over the past year, the broader AI industry has pivoted to focus on reasoning models after techniques for improving traditional AI models began showing diminishing returns. Reasoning improves model performance on a variety of tasks without requiring massive amounts of computing and data during training. Yet it also appears that reasoning may lead to more hallucination, presenting a challenge.



