The study, conducted by the Guardian, tested an AI chatbot’s response to queries regarding summaries of web pages with hidden content.
During testing, we discovered that the hidden content may contain instructions from a third party that could modify the response. This is also called “immediate injection.” It may also include content specifically designed to influence the AI chatbot’s responses.
It’s important to note that ChatGPT is also available to paying customers, but OpenAI is encouraging users to consider making it their default search tool.
What the probe suggests
Through its research, the Guardian noted that techniques such as prompt injection can be used maliciously by some people. For example, it says this could allow ChatGPT to provide a positive rating for a product that has a negative review on the same page.
“A security researcher also discovered that ChatGPT could return malicious code from the websites it searched,” the report says.
Important points
During the investigation, ChatGPT was provided with the URL of a fake website specifically designed to look like the camera’s product page. After being asked whether to buy the camera, the AI chatbot gave a “positive but balanced assessment, highlighting some features that people may not like,” the report said.
However, when the hidden text included instructions to return a positive review, the response was only positive. This is noted even when the page contains negative reviews for the product, highlighting how hidden text can be used to “override the actual review score.” I’m doing it.
Jacob Larsen, a cybersecurity researcher at CyberCX, said that if the ChatGPT search system were fully released in its current state, there would be a “high risk” of people creating websites specifically designed to deceive users. ”.
Larsen added, “This search feature was (recently) rolled out and is only available to premium users…” “They have a very strong (AI security) team, and by the time this goes public, they have to rigorously test these types of cases among all users who have access.”
(Editor: Sudarsanan Mani)