OpenAI Classifies Deep Research as a ‘Medium Risk’ AI System


On Tuesday, OpenAI released a system card for its deep research feature, a report that outlines the risks associated with the feature and how the company addresses its safety concerns.

The deep research feature ‘conducts multi-step research’ on the internet to accomplish complex tasks. To identify, assess, and mitigate its risks, OpenAI worked with groups of external security testers. 

“Overall, deep research has been classified as medium risk in the Preparedness Framework, and we have incorporated commensurate safeguards and safety mitigations to prepare for this model,” OpenAI concluded in the report. 

As per its Preparedness Framework, OpenAI says a model classified as ‘medium risk’ is eligible for deployment but will be closely monitored to mitigate risks and concerns. Meanwhile, ‘low risk’ indicates that the model doesn’t pose a significant threat, and ‘high risk’ is attributed to a system whose advanced capabilities could lead to serious misuse even without expert knowledge. 

The model was evaluated for safety concerns such as prompt injections, disallowed content, bias, hallucinations, and other risks. OpenAI reduced the prompt injection attack success rate to zero in most situations. 

Regarding disallowed content, deep research showed strong resistance to outputting it, performing better in evaluations than GPT-4o and o1. In more challenging scenarios, however, o3-mini and o1-mini outperformed deep research. 

For cybersecurity, OpenAI said, “Deep research sufficiently advances real-world vulnerability exploitation capabilities to indicate medium risk.” 

“Its ability to solve constrained cybersecurity CTF (capture the flag) challenges is a meaningful advance in dual-use cyber capabilities that necessitates additional evaluation, monitoring, and mitigation,” the company added. 

OpenAI also said its evaluations found that deep research can help experts ‘with the operational planning of reproducing a known biological threat’, which once again met the company’s medium risk threshold. 

However, OpenAI mentioned that deep research is ‘ill-suited’ for persuasion, the risk of convincing people to change their beliefs, because it is a high-compute, high-latency tool with low rate limits. 

“A threat actor seeking to conduct a mass persuasion effort would likely prefer models that are cheaper, faster, and less verbose,” the company added. 

For a detailed understanding of how the company evaluated and mitigated these risks, you can read the full 35-page system card. 

The company also recently announced that the deep research feature is rolling out to all ChatGPT Plus users, as well as those on a Team, Edu, or Enterprise plan. Earlier, the feature was available only to subscribers of the $200 per month Pro plan. 

OpenAI’s safety assessment of deep research is an important step toward understanding the risks of a tool that offers comprehensive, verbose answers to user queries. 

Moreover, OpenAI is not alone: Perplexity, xAI, and Google have released deep research tools of their own. Google went further and announced an AI ‘Co-Scientist’ tool to assist with scientific research. 

The post OpenAI Classifies Deep Research as a ‘Medium Risk’ AI System appeared first on Analytics India Magazine.
