←back to thread

645 points helloplanets | 10 comments | | HN request time: 0.208s | source | bottom
Show context
ec109685 ◴[] No.45005397[source]
It’s obviously fundamentally unsafe when Google, OpenAI and Anthropic haven’t released the same feature and instead use a locked down VM with no cookies to browse the web.

LLM within a browser that can view data across tabs is the ultimate “lethal trifecta”.

Earlier discussion: https://news.ycombinator.com/item?id=44847933

It’s interesting that in Brave’s post describing this exploit, they didn’t reach the fundamental conclusion this is a bad idea: https://brave.com/blog/comet-prompt-injection/

Instead they believe model alignment, trying to understand when a user is doing a dangerous task, etc. will be enough. The only good mitigation they mention is that the agent should drop privileges, but it’s just as easy to hit an attacker controlled image url to leak data as it is to send an email.

replies(7): >>45005444 #>>45005853 #>>45006130 #>>45006210 #>>45006263 #>>45006384 #>>45006571 #
1. cma ◴[] No.45005444[source]
I think if you let claude code go wild with auto approval something similar could happen, since it can search the web and has the potential for prompt injection in what it reads there. Even without auto approval on reading and modifying files, if you aren't running it in a sandbox it could write code that then modifies your browser files the next time you do something like run your unit tests that it made, if you aren't reviewing every change carefully.
replies(2): >>45005843 #>>45006390 #
2. veganmosfet ◴[] No.45005843[source]
I tried this on Gemini CLI and it worked, just add some magic vibes ;-)
3. darepublic ◴[] No.45006390[source]
I really don't get why you would use a coding agent in yolo mode. I use the llm code gen in chunks at least glancing over it each time I add something. Why the hell would you have an approach of AI take the wheel
replies(4): >>45006510 #>>45006965 #>>45007931 #>>45009256 #
4. ec109685 ◴[] No.45006510[source]
It still keeps you in the loop, but doesn’t ask to run shell commands, etc.
replies(1): >>45033452 #
5. threecheese ◴[] No.45006965[source]
It depends on what you are using it for; I use CC for producing code that’s run elsewhere, but have also found it’s useful for producing code and commands behind day to day sysadmin/maintenance tasks. I don’t actually allow it to YOLO in this case (I have a few brain cells left), but the fact that it’s excellent at using bash suggests there are some terminal-based computer use tasks it could be useful for, or some set of useful tasks that might be considered harmful on your laptop but much less so in a virtual machine or container.
6. cma ◴[] No.45009256[source]
If you are only glancing over it and not doing a detailed review I think you could get hit with a prompt injection in the way I mentioned, with it writing something into the code that then when you run tests or the app ends up doing the action, which could be spinning up another claude code instance with approval off or turning off safety hooks etc.
replies(1): >>45015377 #
7. darepublic ◴[] No.45015377{3}[source]
The prompt injection would come from where? If I am chatting with the llm and directly copy paste where is the injection. It would have to ge a malicious llm response but that is much much less likely than when you scrape third party sites or documents
replies(1): >>45025923 #
8. cma ◴[] No.45025923{4}[source]
The prompt injection would come when Claude code searches the web. What it then slips in the code would get there when you approve the edit without carefully looking at it, it can be in one line that fetches a payload somewhere else. The execution would come when you run the program you are building or its unit tests or even when you do a build if it is slipped into a make file.
9. jameshart ◴[] No.45033452{3}[source]
That seems like a bad default. VSCode’s agent mode requires approval for shell commands every time by default, with a whitelisting capability (which is itself risky, because hiding shell commands in args to an executable is quite doable). Are people running agents under their own user identity without supervising the commands they run?
replies(1): >>45040033 #
10. cma ◴[] No.45040033{4}[source]
The default is ask for approval with option to whitelist certain commands.