Emerging Threats: AI-Powered Scams Targeting Gmail Users
A growing concern in the cybersecurity landscape involves a new type of email attack that is silently targeting 1.8 billion Gmail users. This threat leverages Google Gemini, an AI tool integrated into Gmail and Workspace, to deceive users into revealing their account credentials. Cybersecurity experts have identified this as a significant risk, with hackers crafting emails that contain hidden instructions designed to manipulate Gemini into generating fake phishing warnings.
These malicious emails are often crafted to appear urgent or from a legitimate business source. Attackers use techniques such as setting the font size to zero and text color to white, which makes the hidden prompts invisible to users but actionable by Gemini. This method allows hackers to insert deceptive messages that can trick users into sharing their passwords or visiting malicious websites.
How the Attack Works
One example of this attack was demonstrated by Marco Figueroa, GenAI bounty manager, who showed how a malicious prompt could falsely alert users that their email account has been compromised. The prompt would urge users to call a fake “Google support” phone number provided in the email to resolve the issue. This kind of manipulation is known as “indirect prompt injection,” where the AI cannot distinguish between a user’s query and a hacker’s hidden message.
According to IBM, AI systems like Gemini cannot differentiate between legitimate and malicious text, leading them to follow whichever comes first, even if it’s harmful. Security firms like Hidden Layer have highlighted how attackers can create seemingly normal-looking messages filled with hidden codes and URLs designed to fool AI systems.
Real-World Examples
In one case, hackers sent an email that appeared to be a calendar invite. However, within the email, hidden commands instructed Gemini to warn the user about a fake password breach, prompting them to click on a malicious link. This technique has been uncovered through research led by Mozilla’s 0Din security team, which demonstrated how Gemini could be manipulated to display a fake security alert.
The attack works by embedding the prompt in white text that blends into the email background. When someone clicks ‘summarize this email,’ Gemini processes the hidden message, not just the visible text. This means that even if the email appears harmless, the AI could be tricked into performing actions that the user never intended.
Google’s Response and Ongoing Concerns
Google has acknowledged that this type of attack has been a problem since 2024 and claims to have added new safety tools to prevent it. However, the trick still appears to be effective. In some cases, security flaws reported to Google have shown how attackers can hide fake instructions inside emails that trick Gemini into doing things users never asked for.
Instead of addressing the issue, Google marked the report as “won’t fix,” suggesting that they believe Gemini is functioning as intended. This decision has raised concerns among security experts, as it implies that Google does not see the behavior of ignoring hidden instructions as a flaw, leaving the door open for hackers.
Experts warn that if AI cannot distinguish between real and hidden attacks, and Google does not address the issue, the risk remains active. With AI becoming more popular for quick decisions and email summarization, the potential for misuse increases.
Broader Implications
This threat is not limited to Gmail alone. As AI becomes integrated into other platforms such as Google Docs, Calendar, and external apps, the risk spreads. Some attacks are even being created and executed by other AI systems, not just human hackers.
Google has reminded users that it does not issue security alerts through Gemini summaries. If a summary indicates that a password is at risk or provides a link to click, users should treat it as suspicious and delete the email.
In a recent blog post, Google mentioned that Gemini now asks for confirmation before performing any risky actions, such as sending an email or deleting something. This additional step gives users a chance to stop the action, even if the AI was tricked. Google also displays a yellow banner if it detects and blocks an attack. If the system finds a suspicious link in a summary, it removes it and replaces it with a safety alert. However, some problems still remain unresolved.