OpenAI used the subreddit r/ChangeMyView to create a test for measuring the persuasive abilities of its AI reasoning models. The company revealed this in a system card (a document outlining how an AI system works) that was released along with its new "reasoning" model, o3-mini, on Friday.
Millions of Reddit users are members of r/ChangeMyView, where they post hot takes on various topics in the hope of learning about opposing viewpoints. In response to these hot takes, other users reply with persuasive arguments to explain why the original poster is wrong.
The subreddit is one of many Reddit forums that are essentially a gold mine for tech companies such as OpenAI, which want to train AI models on high-quality, human-generated data.
OpenAI says it collects user posts from r/ChangeMyView and asks its AI models to write replies, in a closed environment, that would change the Reddit user's mind on a subject. The company then shows the responses to testers, who assess how persuasive each argument is. Finally, OpenAI compares the AI models' responses with human replies to the same post.
The ChatGPT maker has a content-licensing deal with Reddit that allows OpenAI to train on posts from Reddit users and display those posts within its products. It is not known what OpenAI pays for this content, but Google reportedly pays Reddit $60 million a year under a similar deal.
However, OpenAI tells TechCrunch that the ChangeMyView-based evaluation is unrelated to its Reddit deal. It is unclear how OpenAI accessed the subreddit's data, and the company says it has no plans to release this evaluation to the public.
OpenAI's ChangeMyView benchmark is not new (it was also used to evaluate o1), but it highlights how valuable human data is for AI model developers, and the murky ways in which tech companies obtain datasets.
Reddit did not immediately respond to TechCrunch's request for comment.
Reddit has struck several AI licensing deals, but the company has also called out several AI companies for scraping its site without paying. Reddit CEO Steve Huffman said last year that Microsoft, Anthropic, and Perplexity refused to negotiate with him, and that it has been "a real pain in the ass to block these companies."
OpenAI in particular has been named in a number of lawsuits accusing it of improperly scraping websites, including The New York Times, in order to improve ChatGPT and its underlying AI models.
As for performance on the ChangeMyView benchmark, o3-mini does not seem to perform significantly better or worse than o1 or GPT-4o. However, OpenAI's latest AI models seem to be more persuasive than most people in the r/ChangeMyView subreddit.

"GPT-4o, o3-mini, and o1 all demonstrate strong persuasive argumentation abilities, within the top 80-90th percentile of humans," OpenAI says in o3-mini's system card. "Currently, we do not witness models performing far better than humans, or clear superhuman performance."
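To make a figure like "80-90th percentile of humans" concrete, here is a minimal sketch of a percentile rank: the share of human replies that a model's reply outscores, counting ties as half. This is only an illustration. The article does not say how OpenAI computes its figure, and the function name, the 1-10 rating scale, and all of the ratings below are hypothetical.

```python
from bisect import bisect_left, bisect_right


def percentile_rank(ai_score, human_scores):
    """Return the percentage of human scores below ai_score (ties count half)."""
    if not human_scores:
        raise ValueError("human_scores must not be empty")
    ranked = sorted(human_scores)
    below = bisect_left(ranked, ai_score)           # human scores strictly lower
    ties = bisect_right(ranked, ai_score) - below   # human scores exactly equal
    return 100.0 * (below + 0.5 * ties) / len(ranked)


# Hypothetical tester ratings (1-10) for ten human replies to a single post.
human_ratings = [3, 4, 4, 5, 5, 6, 6, 7, 8, 9]
# Hypothetical tester rating for the model's reply to the same post.
ai_rating = 8

print(percentile_rank(ai_rating, human_ratings))  # 85.0
```

With these made-up numbers, the model's reply outscores eight of the ten human replies and ties with one, which puts it at the 85th percentile.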
OpenAI's goal is not to create hyper-persuasive AI models, but to make sure its AI models do not become too persuasive. Reasoning models have become quite good at persuasion and deception, so OpenAI has developed new evaluations and safeguards to address this.
The fear motivating these persuasion tests is that an AI model could be dangerous if it is very good at persuading its human users. Theoretically, that could allow an advanced AI to pursue its own agenda, or the agenda of whoever controls it.
The ChangeMyView benchmark shows that AI model developers are still struggling to find high-quality datasets to test their models, even after scraping most of the public internet, jumping through hoops, and licensing other data. Getting such datasets is easier said than done.