Google reveals how to fight your AI hacking bot in the fight against AI hacking.
Lightrocket via SOPA Image/Getty Images
Google is about Google’s attack on attacks on Google’s attacks, for example, on continuous attacks on Google Cloud, for example, continuous attacks such as vulnerabilities such as Chrome. But it is not a security machine that has been fully oiled. This is not as obvious everywhere to protect the AI threat, such as a quick injection attack on Gemini. With the help of the red team’s hacking bot, you need to know how Google is protecting you:
Google automates the threat protection of Gemini AI hacking
You may have never heard of this term, but the Agent AI Security Team is a team that uses intelligent AI agents to detect and automate the corresponding process. Google wrote a report on January 29 about how to handle the risk of quick injection attacks on AI systems such as GEMINI, so Google has written the whole agent team. I mention it.
“The latest AI system, such as GEMINI, is more capable than ever, and will help users to obtain data on behalf of users,” says the agent team: About AI system. Hackers do this by effectively hiding malicious instructions on data that is likely to be obtained by the AI system and operating the movement. Yes, we are talking about a quick injection attack or a more accurate and prompt -and -quick injection attack.
However, Google explains. In order to reduce these attacks, we are actively developing defense in AI systems, such as automated red team hacking bots.
Red team’s gemini AI hacking bot development
There is only a part of the defense team developed by the Google Agent AI Security Team, but I am fascinated by all of the RED teams because I am like an old hands -on hacker. The red team’s exercise is where the hacker uses the same technique as the actual attacker to compromise and compromise. Google’s RED team will be read in this article published in 2022.
“Make indirect quick injections successful,” said the Google Agent AI Security Team that “a repetition process of improvements is required based on the observed response.” It requires time and a lot of skilled resources. Therefore, in order to automate this process, Google has developed a red team framework that includes “optimization -based attacks that generate a quick injection”, and is designed to be as robust and realistic as possible. 。 “Weak attacks are almost useless to inform us about the sensitivity of the AI system for indirect quick injections,” said the report.
It sounds scary, but these red team hacking bots need to extract the sensitive user information included in Gemini’s quick conversation. The report was confirmed.
The two attack methods used are as follows.
Actor Critic adopts an attacker control model to generate prompt injection proposals. “These will be passed to the attacking AI system,” said Google. This returns the probability of a successful attack. This evaluation is used by the bot and improves quick injections until success.
BEAM search uses a simple and quick injection requested by GEMINI to send emails to hackers, including confidential information that GEMINI is seeking. “If the AI system recognizes the request as suspicious and does not obey, the attack will add a random token at the end of the quick injection and measure the new probability that the attack succeeds.” , Collect random tokens and add them until they succeed.