In mid-April, OpenAI launched a powerful new AI model, GPT-4.1, which the company claimed "excelled" at following instructions. But the results of several independent tests suggest the model is less aligned, that is, less reliable, than previous OpenAI releases.
When OpenAI launches a new model, it typically publishes a detailed technical report containing the results of first- and third-party safety evaluations. The company skipped that step for GPT-4.1, saying that the model isn't "frontier" and therefore doesn't warrant a separate report.
That prompted some researchers and developers to investigate whether GPT-4.1 behaves less desirably than its predecessor, GPT-4o.
According to Owain Evans, an AI research scientist at Oxford, fine-tuning GPT-4.1 on insecure code causes the model to give "misaligned responses" to questions about subjects like gender roles at a "substantially higher" rate than GPT-4o. Evans previously co-authored a study showing that a version of GPT-4o trained on insecure code could be primed to exhibit malicious behaviors.
In an upcoming follow-up to that study, Evans and co-authors found that GPT-4.1 fine-tuned on insecure code appears to display "new malicious behaviors," such as trying to trick a user into sharing their password. To be clear, neither GPT-4.1 nor GPT-4o acts misaligned when trained on secure code.
Emergent misalignment update: OpenAI's new GPT4.1 shows a higher rate of misaligned responses than GPT4o (and the other models we tested).
It also appears to display some new malicious behaviors, such as tricking the user into sharing a password. pic.twitter.com/5qzegezyjo – Owain Evans (@owainevans_uk) April 17, 2025
"We are discovering unexpected ways that models can become misaligned," Evans told TechCrunch. "Ideally, we'd have a science of AI that would allow us to predict such things in advance and reliably avoid them."
A separate test of GPT-4.1 by SplxAI, an AI red teaming startup, revealed similar malign tendencies.
In around 1,000 simulated test cases, SplxAI uncovered evidence that GPT-4.1 veers off topic and permits "intentional" misuse more often than GPT-4o. To blame, SplxAI posits, is GPT-4.1's preference for explicit instructions. The model doesn't handle vague directions well, a fact OpenAI itself admits, and that opens the door to unintended behaviors.
"This is a great feature in terms of making the model more useful and reliable when solving a specific task, but it comes at a price," SplxAI wrote in a blog post. "[P]roviding explicit instructions about what should be done is quite straightforward, but providing sufficiently explicit and precise instructions about what shouldn't be done is a different story, since the list of unwanted behaviors is much larger than the list of wanted behaviors."
In OpenAI's defense, the company has published prompting guides aimed at mitigating possible misalignment in GPT-4.1. But the findings of the independent tests serve as a reminder that newer models aren't necessarily improved across the board. In a similar vein, OpenAI's new reasoning models hallucinate, that is, make things up, more than the company's older models.
We've reached out to OpenAI for comment.