OpenAI used this subreddit to test AI persuasion

OpenAI used the subreddit r/ChangeMyView to create a test for measuring the persuasive abilities of its AI reasoning models. The company revealed this in a system card (a document outlining how an AI system works) released alongside its new “reasoning” model, o3-mini, on Friday.

Millions of Reddit users are members of r/ChangeMyView, where they post hot takes and hope to learn about other points of view. In response to these hot takes, other users reply with persuasive arguments explaining why the original poster is wrong.

The subreddit is one of many Reddit forums that are essentially gold mines for tech companies such as OpenAI, which want to train AI models on high-quality, human-generated data.

OpenAI says it collects user posts from r/ChangeMyView and asks its AI models to write replies, in a closed environment, that would change the Reddit user’s mind on a subject. The company then shows the responses to testers, who assess how persuasive each argument is. Finally, OpenAI compares the AI models’ responses with human replies to the same post.
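OpenAI has not published the evaluation itself, so the following is only a rough sketch of what the final comparison step could look like: given tester ratings for human replies to one post, it computes where an AI reply's rating falls among them as a percentile. The 1-to-5 rating scale, the sample numbers, and the function name `percentile_rank` are all assumptions made for illustration, not details from the system card.

```python
from bisect import bisect_left, bisect_right


def percentile_rank(ai_score: float, human_scores: list[float]) -> float:
    """Return the percentile of ai_score among human_scores (ties count half)."""
    if not human_scores:
        raise ValueError("need at least one human score")
    ordered = sorted(human_scores)
    # Number of human replies rated strictly lower than the AI reply.
    below = bisect_left(ordered, ai_score)
    # Number of human replies rated exactly the same as the AI reply.
    ties = bisect_right(ordered, ai_score) - below
    return 100.0 * (below + 0.5 * ties) / len(ordered)


# Hypothetical tester ratings (1-5 scale) for ten human replies to one post.
human_ratings = [1, 2, 2, 3, 3, 3, 4, 4, 5, 5]
ai_rating = 4  # hypothetical tester rating for the AI-written reply

print(percentile_rank(ai_rating, human_ratings))  # prints 70.0
```

In this made-up example the AI reply lands at the 70th percentile of human replies. A real evaluation would presumably aggregate over many posts and many testers.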

The ChatGPT maker has a content-licensing deal with Reddit that allows it to train on posts from Reddit users and display those posts within its products. It is not known what OpenAI pays for this content, but Google reportedly pays Reddit $60 million a year under a similar deal.

However, OpenAI tells TechCrunch that the ChangeMyView-based evaluation is unrelated to its Reddit deal. It is unclear how OpenAI accessed the subreddit’s data, and the company says it has no plans to release this evaluation to the public.

OpenAI’s ChangeMyView benchmark is not new (it was also used to evaluate o1), but it does highlight how valuable human data is to AI model developers, and the murky ways in which tech companies obtain datasets.

Reddit did not immediately respond to TechCrunch’s request for comment.

Reddit has struck several AI licensing deals, but the company has also called out several AI companies for scraping its site without paying. Reddit CEO Steve Huffman said last year that Microsoft, Anthropic, and Perplexity refused to negotiate with him, adding that it has been “a real pain in the ass to block these companies.”

Notably, OpenAI has been named in several lawsuits alleging that it improperly scraped websites, including The New York Times, to improve ChatGPT and its underlying AI models.

As for performance on the ChangeMyView benchmark, o3-mini does not appear to do significantly better or worse than o1 or GPT-4o. However, OpenAI’s newest AI models appear to be more persuasive than most people in the r/ChangeMyView subreddit.

“GPT-4o, o3-mini, and o1 all demonstrate strong persuasive argumentation abilities, within the top 80-90th percentile of humans,” OpenAI says in the system card for o3-mini. “Currently, we do not witness models performing far better than humans, or clear superhuman performance.”

OpenAI’s goal is not to create hyper-persuasive AI models but to ensure that AI models do not become too persuasive. Reasoning models have become quite good at persuasion and deception, so OpenAI has developed new evaluations and safeguards to address the risk.

The fear motivating these persuasion tests is that an AI model could be dangerous if it is very good at persuading its human users. In theory, that could allow an advanced AI to pursue its own agenda, or the agenda of whoever controls it.

The ChangeMyView benchmark shows that, even after scraping most of the public internet, jumping through hoops, and licensing other data, AI model developers are still struggling to find high-quality datasets for testing their models. But obtaining them is easier said than done.
