OpenAI has said that some attack methods against AI browsers like ChatGPT Atlas are likely here to stay, raising questions about whether AI agents can ever safely operate across the open web.
The main issue is a type of attack called “prompt injection,” where hackers hide malicious instructions in websites, documents, or emails that can trick the AI agent into doing something harmful. For example, an attacker could embed hidden commands in a webpage—perhaps in text that is invisible to the human eye but looks legitimate to an AI—that override a user’s instructions and tell an agent to share a user’s emails, or drain someone’s bank account.
Following the launch of OpenAI’s ChatGPT Atlas browser in October, several security researchers demonstrated how a few words hidden in a Google Doc or clipboard link could manipulate the AI agent’s behavior. Brave, an open-source browser company that previously disclosed a flaw in Perplexity’s Comet browser, also published research warning that all AI-powered browsers are vulnerable to attacks like indirect prompt injection.
“Prompt injection, much like scams and social engineering on the web, is unlikely to ever be fully ‘solved,'” OpenAI wrote in a blog post Monday, adding that “agent mode” in ChatGPT Atlas “expands the security threat surface.”
OpenAI ...

1 month ago
14















English (US) ·