←back to thread

645 points helloplanets | 6 comments | | HN request time: 0.001s | source | bottom
Show context
ec109685 ◴[] No.45005397[source]
It’s obviously fundamentally unsafe when Google, OpenAI and Anthropic haven’t released the same feature and instead use a locked down VM with no cookies to browse the web.

LLM within a browser that can view data across tabs is the ultimate “lethal trifecta”.

Earlier discussion: https://news.ycombinator.com/item?id=44847933

It’s interesting that in Brave’s post describing this exploit, they didn’t reach the fundamental conclusion this is a bad idea: https://brave.com/blog/comet-prompt-injection/

Instead they believe model alignment, trying to understand when a user is doing a dangerous task, etc. will be enough. The only good mitigation they mention is that the agent should drop privileges, but it’s just as easy to hit an attacker controlled image url to leak data as it is to send an email.

replies(7): >>45005444 #>>45005853 #>>45006130 #>>45006210 #>>45006263 #>>45006384 #>>45006571 #
1. ryanjshaw ◴[] No.45006210[source]
Maybe the article was updated but right now it says “The browser should isolate agentic browsing from regular browsing”
replies(3): >>45006335 #>>45006361 #>>45009220 #
2. skaul ◴[] No.45006335[source]
That was in the blog from the starting, and it's also the most important mitigation we identified immediately when starting to think about building agentic AI into the browser. Isolating agentic browsing while still enabling important use-cases (which is why users want to use agentic browsing in the first place) is the hard part, which is presumably why many browsers are just rolling out agentic capabilities in regular browsing.
replies(1): >>45010144 #
3. ec109685 ◴[] No.45006361[source]
That was my point about dropping privileges. It can still be exploited if the summary contains a link to an image that the attacker can control via text on the page that the LLM sees. It’s just a lot of Swiss cheese.

That said, it’s definitely the best approach listed. And turns that exploit into an XSS attack on reddit.com, which is still bad.

4. mapontosevenths ◴[] No.45009220[source]
Tabs in general should be security boundaries. Anything else should propmt for permission.
5. waterproof ◴[] No.45010144[source]
Isn't there a situation where the agentic browser, acting correctly on behalf of the user, needs to send Bitcoin or buy plane tickets? Isn't that flexibility kind of the whole point of the system? If so, I don't see what you get by distinguishing between agentic and no agentic browsing.

Bad actors will now be working to scam users' LLMs rather than the users themselves. You can use more LLMs to monitor the LLMs and try and protect them, but it's turtles all the way down.

The difference: when someone loses their $$$, they're not a fool for falling for some Nigerian Prince wire scam themselves, they're just a fool for using your browser.

Or am I missing something?

replies(1): >>45015291 #
6. skaul ◴[] No.45015291{3}[source]
You're right that if the user logs into a sensitive website, the "isolated browsing" mitigation stops helping. We don't want the user to accidentally end up in that state though. Separately, I can also imagine use-cases for agentic browsing where the user doesn't have to be logged into sensitive websites. Summarizing Hacker News front page, for one.