AI browsers may have an incurable vulnerability, warns OpenAI
OpenAI has recently stated in an official blog that AI agents designed to operate web browsers may always be vulnerable to a specific type of attack known as "prompt injection", framing it as a persistent security challenge similar to online scams.
In a detailed update, the company explained that "prompt injection" attacks target AI agents by embedding hidden, malicious instructions into content the agent processes, such as emails or webpages. These instructions can hijack the agent, causing it to ignore a user's commands and instead perform harmful actions like sending sensitive data or making unauthorised transactions.
To counter these threats, OpenAI says it has implemented a rapid response cycle using automated security research. The company trains an AI-based attacker to continuously hunt for new ways to exploit its own browser agent within ChatGPT Atlas. When a successful attack method is discovered, the data is used to update and "adversarially train" the agent's model, a recent example of which has been rolled out to all users.
Despite these efforts, OpenAI cautioned that achieving complete, deterministic security against such attacks is unlikely. The company compared the long-term challenge of prompt injection to the ongoing evolution of scams targeting humans. As a result, it recommends users adopt safety practices like limiting the agent's access to logged-in accounts, carefully reviewing its confirmation requests, and providing specific instructions rather than overly broad tasks.


Comments