It is like SQL injection. Probably worse. If you are using unsupervised data for context that ultimately generates executable code you will have this security problem. Duh.
replies(1):
The key thing (really, the only thing) about parameterized queries is that they allow you to provide code and data with a hard separation between the two.
LLMs don't have anything of the sort. They only take in one kind of thing. They don't even have a notion of code versus data that you could separate, or fail to separate. All you can do is either tolerate it sometimes taking instructions from the stuff you want treated as "data," or never give it anything you consider "data." You propose this second one. But never giving it "data" is very different from a feature that allows you to provide arbitrary data with total safety.