←back to thread

645 points helloplanets | 1 comments | | HN request time: 0s | source
Show context
ec109685 ◴[] No.45005397[source]
It’s obviously fundamentally unsafe when Google, OpenAI and Anthropic haven’t released the same feature and instead use a locked down VM with no cookies to browse the web.

LLM within a browser that can view data across tabs is the ultimate “lethal trifecta”.

Earlier discussion: https://news.ycombinator.com/item?id=44847933

It’s interesting that in Brave’s post describing this exploit, they didn’t reach the fundamental conclusion this is a bad idea: https://brave.com/blog/comet-prompt-injection/

Instead they believe model alignment, trying to understand when a user is doing a dangerous task, etc. will be enough. The only good mitigation they mention is that the agent should drop privileges, but it’s just as easy to hit an attacker controlled image url to leak data as it is to send an email.

replies(7): >>45005444 #>>45005853 #>>45006130 #>>45006210 #>>45006263 #>>45006384 #>>45006571 #
snet0 ◴[] No.45005853[source]
> Instead they believe model alignment, trying to understand when a user is doing a dangerous task, etc. will be enough.

Maybe I have a fundamental misunderstanding, but I feel like hoping that model alignment and in-model guardrails are statistical preventions, ie you'll reduce the odds to some number of zeroes preceeding the 1. These things should literally never be able to happen, though. It's a fools errand to hope that you'll get to a model where there is no value in the input space that maps to <bad thing you really don't want>. Even if you "stack" models, having a safety-check model act on the output of your larger model, you're still just multiplying odds.

replies(5): >>45006201 #>>45006251 #>>45006358 #>>45007218 #>>45007846 #
zulban ◴[] No.45007218[source]
"These things should literally never be able to happen"

If we consider "humans using a bank website" and apply the same standard, then we'd never have online banking at all. People have brain farts. You should ask yourself if the failure rate is useful, not if it meets a made up perfection that we don't even have with manual human actions.

replies(3): >>45007312 #>>45007425 #>>45007768 #
echelon ◴[] No.45007312[source]
The vast majority of humans would fall to bad security.

I think we should continue experimenting with LLMs and AI. Evolution is littered with the corpses of failed experiments. It would be a shame if we stopped innovating and froze things with the status quo because we were afraid of a few isolated accidents.

We should encourage people that don't understand the risks not to use browsers like this. For those that do understand, they should not use financial tools with these browsers.

Caveat emptor.

Don't stall progress because "eww, AI". Humans are just as gross.

We need to make mistakes to grow.

replies(2): >>45007347 #>>45007786 #
girvo ◴[] No.45007786[source]
When your “mistakes” are “a user has their bank account drained irrecoverably”, no, we don’t.
replies(1): >>45007817 #
echelon ◴[] No.45007817[source]
So let's stop building browser agents?

This is a hypothetical Reddit comment that got Tweeted for attention. The to-date blast radius of this is zero.

What you're looking at now is the appropriate level of concern.

Let people build the hacky pizza ordering automations so we can find the utility sweet spots and then engineer more robust systems.

replies(4): >>45009276 #>>45009412 #>>45012308 #>>45018274 #
1. wat10000 ◴[] No.45009412{4}[source]
1. Access to untrusted data.

2. Access to private data.

3. Ability to communicate externally.

An LLM must not be given all three of these, or it is inherently insecure. Any two is fine (mostly, private data and external communication is still a bit iffy), but if you give them all three then you're screwed. This is inherent to how LLMs work, you can't fix it as the technology stands today.

This isn't a secret. It's well known, and it's also something you can easily derive from first principles if you know the basics of how LLMs work.

You can build browser agents, but you can't give them all three of these things. Since a browser agent inherently accesses untrusted data and communicates externally, that means that it must not be given access to private data. Run it in a separate session with no cookies or other local data from your main session and you're fine. But running it in the user's session with all of their state is just plain irresponsible.