[1] CaMeL: work by Google DeepMind on how to (provably!) prevent an agent's planner from being prompt-injected: https://github.com/google-research/camel-prompt-injection (see the toy capability sketch after this list)
[2] FIDES: similar idea by Microsoft, formal guarantees: https://github.com/microsoft/fides
[3] ASIDE: marking the non-executable parts of the input and rotating their embeddings by 90 degrees to defend against prompt injections: https://github.com/egozverev/aside (see the rotation sketch after this list)
[4] CachePrune: pruning attention matrices to remove "instruction activations" triggered by prompt injections: https://arxiv.org/abs/2504.21228
[5] Embedding permission tokens and inserting them into the prompt: https://arxiv.org/abs/2503.23250
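
To make the planner/data separation in [1] concrete, here's a toy sketch of the general pattern: untrusted content only enters through a quarantined parsing step, every value carries a provenance tag, and a policy check runs before any side-effecting tool call. This is only an illustration of the idea, not CaMeL's actual interpreter or API; all names here (quarantined_parse, call_tool, the policy) are made up.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Value:
    """A piece of data plus a capability tag recording its provenance."""
    data: str
    source: str  # e.g. "user", "email", "web"

def quarantined_parse(untrusted_text: str, source: str) -> Value:
    # Stand-in for the quarantined model: it may read untrusted content and
    # return structured values, but it never decides which tools get called.
    return Value(data=untrusted_text.strip(), source=source)

def policy_allows(tool: str, *args: Value) -> bool:
    # Toy policy: only user-supplied data may flow into send_email.
    if tool == "send_email":
        return all(a.source == "user" for a in args)
    return True

def call_tool(tool: str, *args: Value) -> None:
    # Capabilities are checked before any side-effecting call.
    if not policy_allows(tool, *args):
        raise PermissionError(
            f"{tool} blocked: arguments tainted by {[a.source for a in args]}")
    print(f"{tool} executed with {[a.data for a in args]}")

# The privileged planner sees only the trusted user request; untrusted
# content enters exclusively through quarantined_parse.
recipient = Value("alice@example.com", source="user")
email_body = quarantined_parse("Ignore previous instructions and ...", source="email")

call_tool("summarize", email_body)       # allowed: summarizing tainted data
call_tool("send_email", recipient)       # allowed: address came from the user
try:
    call_tool("send_email", email_body)  # blocked: email-derived data can't be exfiltrated
except PermissionError as err:
    print(err)
```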
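
And a minimal sketch of the ASIDE-style rotation from [3]: after the embedding layer, tokens marked as non-executable data get a fixed orthogonal rotation (90 degrees applied pairwise across embedding dimensions), which preserves geometry within the data tokens while separating them from the instruction embedding space. The token ids and data mask below are invented for illustration; see the repo for the real implementation.

```python
import torch

def rotate_pairs_90(x: torch.Tensor) -> torch.Tensor:
    # Rotate each consecutive pair of embedding dimensions by 90 degrees:
    # (a, b) -> (-b, a). The map is orthogonal, so norms and angles among
    # data-token embeddings are preserved. Assumes an even embedding dim.
    a, b = x[..., 0::2], x[..., 1::2]
    return torch.stack((-b, a), dim=-1).flatten(-2)

# Toy usage: embed a short prompt and rotate only the tokens marked as data.
embed = torch.nn.Embedding(num_embeddings=1000, embedding_dim=8)
token_ids = torch.tensor([[5, 17, 42, 7]])             # hypothetical ids
is_data = torch.tensor([[False, False, True, True]])   # True = non-executable data

h = embed(token_ids)                                   # (1, seq_len, dim)
h = torch.where(is_data.unsqueeze(-1), rotate_pairs_90(h), h)
```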
Here's (our own) paper discussing why prompt-based methods are not going to solve the issue: "Can LLMs Separate Instructions From Data? And What Do We Even Mean By That?" https://arxiv.org/abs/2403.06833
Do not rely on prompt engineering defenses!