The security paradox of local LLMs

(quesma.com)

145 points jakozaur | 4 comments | 22 Oct 25 12:48 UTC | HN request time: 0s | source

Show context

TedDallas ◴[22 Oct 25 15:24 UTC] No.45670559[source]▶

It is like SQL injection. Probably worse. If you are using unsupervised data for context that ultimately generates executable code you will have this security problem. Duh.

replies(1): >>45670618 #

philipwhiuk ◴[22 Oct 25 15:28 UTC] No.45670618[source]▶

>>45670559 #

Worse because there's really no equivalent to prepared statements.

replies(1): >>45671189 #

charcircuit ◴[22 Oct 25 16:03 UTC] No.45671189[source]▶

>>45670618 #

Sure there is. A common way is to have the LLM generate things like {name} which will get substituted for the user's name instead of trying to get the LLM itself to generate the user's name.

replies(1): >>45674025 #

wat10000 ◴[22 Oct 25 19:31 UTC] No.45674025[source]▶

>>45671189 #

Parameterized queries allow you to provide untrusted input to the database in a way that's guaranteed not to be interpreted as instructions.

There's nothing like that for LLMs.

replies(1): >>45674077 #

1. charcircuit ◴[22 Oct 25 19:35 UTC] No.45674077{3}[source]▶

>>45674025 #

That's what I explained. You are trying to do something with an untrusted name and the LLM will not treat the name as instructions because it doesn't see the actual name.

replies(1): >>45674721 #

2. wat10000 ◴[22 Oct 25 20:26 UTC] No.45674721[source]▶

>>45674077 (TP) #

You mentioned having the LLM generate a placeholder, whereas the important thing is what it accepts. You can feed an LLM nothing but placeholders but that's very limited since it can't see the the actual data in any way. You're really just having it emit a template. Something simple like "make a calendar event for the reservation in this email" could not be done. In contrast, parameterized queries let the database actually operate on the data.

replies(1): >>45675042 #

3. charcircuit ◴[22 Oct 25 20:56 UTC] No.45675042[source]▶

>>45674721 #

It may be limited but that doesn't mean it's not similar. For example MySQL can't check the weather when given city string as a paramertized query, but that doesn't mean MySQL doesn't have parameterized queries.

replies(1): >>45675128 #

4. wat10000 ◴[22 Oct 25 21:05 UTC] No.45675128{3}[source]▶

>>45675042 #

Querying external information is a different category of thing altogether.

The key thing (really, the only thing) about parameterized queries is that they allow you to provide code and data with a hard separation between the two.

LLMs don't have anything of the sort. They only take in one kind of thing. They don't even have a notion of code versus data that you could separate, or fail to separate. All you can do is either tolerate it sometimes taking instructions from the stuff you want treated as "data," or never give it anything you consider "data." You propose this second one. But never giving it "data" is very different from a feature that allows you to provide arbitrary data with total safety.

↑