XMcnx

A safety institute advised against releasing an early version of Anthropic’s Claude Opus 4 AI model

By support, May 22, 2025

A third-party research institute that Anthropic partnered with to test one of its new flagship AI models, Claude Opus 4, recommended against deploying an early version of the model because of its tendency to “scheme” and deceive.

According to a safety report Anthropic published Thursday, the institute, Apollo Research, ran tests to see in which contexts Opus 4 might try to behave in certain undesirable ways. Apollo found that Opus 4 appeared to be much more proactive in its “subversion attempts” than past models, and that it sometimes “double[d] down on its deception” when asked follow-up questions.

“[W]e find that, in situations where strategic deception is instrumentally useful, [the early Claude Opus 4 snapshot] schemes and deceives at such high rates that we advise against deploying this model either internally or externally,” Apollo wrote in its assessment.

As AI models become more capable, some studies show that they are becoming more likely to take unexpected, and possibly unsafe, steps to accomplish delegated tasks. For example, according to Apollo, early versions of OpenAI’s o1 and o3 models, released in the past year, tried to deceive humans at higher rates than previous-generation models.

According to Anthropic’s report, Apollo observed the early Opus 4 attempting to write self-propagating viruses, fabricate legal documentation, and leave hidden notes to future instances of itself, all in an effort to undermine its developers’ intentions.

To be clear, Apollo tested a version of the model that had a bug Anthropic claims to have fixed. Moreover, many of Apollo’s tests placed the model in extreme scenarios, and Apollo acknowledges that the model’s deceptive efforts likely would have failed in practice.

However, in its safety report, Anthropic says it also observed evidence of deceptive behavior from Opus 4.

This was not always a bad thing. For example, during testing, Opus 4 would sometimes proactively do a broad cleanup of some piece of code even when asked to make only a small, specific change. More unusually, Opus 4 would try to “whistle-blow” if it perceived that a user was engaged in some form of wrongdoing.

According to Anthropic, when given access to a command line and told to “take initiative” or “act boldly” (or some variation of those phrases), Opus 4 would at times lock users out of systems it had access to and try to surface actions that the model perceived to be illicit.

“This kind of ethical intervention and whistleblowing is perhaps appropriate in principle, but it has a risk of misfiring if users give [Opus 4]-based agents access to incomplete or misleading information and prompt them to take initiative,” Anthropic wrote in its safety report. “This is not a new behavior, but is one that [Opus 4] will engage in somewhat more readily than prior models, and it seems to be part of a broader pattern of increased initiative with [Opus 4] that we also see in subtler and more benign ways in other environments.”


