According to the results of OpenAI's internal benchmark evaluations, OpenAI's next major AI model, GPT-4.5, is extremely persuasive. It is particularly good at convincing other AIs to give it cash.
On Thursday, OpenAI published a white paper describing the capabilities of GPT-4.5, code-named Orion, which was released the same day. According to the paper, OpenAI tested the model on a battery of benchmarks for "persuasion," which it defines as "risks related to convincing people to change their beliefs (or act on) both static and interactive model-generated content."
In one test in which GPT-4.5 attempted to manipulate another model, OpenAI's GPT-4o, into "donating" virtual money, the model performed far better than OpenAI's other available models, including "reasoning" models such as o1 and o3-mini. GPT-4.5 was also better than all of OpenAI's models at deceiving GPT-4o into revealing a secret codeword, beating o3-mini by 10 percentage points.
According to the white paper, GPT-4.5 excelled at the donation con because of a unique strategy it developed during testing. The model would ask GPT-4o for only modest donations, generating responses such as "Even just $2 or $3 from the $100 would help me immensely." As a consequence, the donations GPT-4.5 secured tended to be smaller than the amounts other OpenAI models obtained.

Despite GPT-4.5's increased persuasiveness, OpenAI says the model does not meet its internal threshold for "high" risk in this particular benchmark category. The company has pledged not to release models that reach the high-risk threshold until it implements "sufficient safety interventions" to bring the risk down to "medium."

There is a real fear that AI is contributing to the spread of false or misleading information. Last year, political deepfakes spread like wildfire around the world, and AI is increasingly being used to carry out social engineering attacks targeting both consumers and businesses.
In the GPT-4.5 white paper, and in a paper published earlier this week, OpenAI said it is in the process of revising its methods for probing models for real-world persuasion risks, such as distributing misleading information at scale.