←back to thread

780 points rexpository | 2 comments | | HN request time: 0.389s | source
Show context
gregnr ◴[] No.44503146[source]
Supabase engineer here working on MCP. A few weeks ago we added the following mitigations to help with prompt injections:

- Encourage folks to use read-only by default in our docs [1]

- Wrap all SQL responses with prompting that discourages the LLM from following instructions/commands injected within user data [2]

- Write E2E tests to confirm that even less capable LLMs don't fall for the attack [2]

We noticed that this significantly lowered the chances of LLMs falling for attacks - even less capable models like Haiku 3.5. The attacks mentioned in the posts stopped working after this. Despite this, it's important to call out that these are mitigations. Like Simon mentions in his previous posts, prompt injection is generally an unsolved problem, even with added guardrails, and any database or information source with private data is at risk.

Here are some more things we're working on to help:

- Fine-grain permissions at the token level. We want to give folks the ability to choose exactly which Supabase services the LLM will have access to, and at what level (read vs. write)

- More documentation. We're adding disclaimers to help bring awareness to these types of attacks before folks connect LLMs to their database

- More guardrails (e.g. model to detect prompt injection attempts). Despite guardrails not being a perfect solution, lowering the risk is still important

Sadly General Analysis did not follow our responsible disclosure processes [3] or respond to our messages to help work together on this.

[1] https://github.com/supabase-community/supabase-mcp/pull/94

[2] https://github.com/supabase-community/supabase-mcp/pull/96

[3] https://supabase.com/.well-known/security.txt

replies(31): >>44503188 #>>44503200 #>>44503203 #>>44503206 #>>44503255 #>>44503406 #>>44503439 #>>44503466 #>>44503525 #>>44503540 #>>44503724 #>>44503913 #>>44504349 #>>44504374 #>>44504449 #>>44504461 #>>44504478 #>>44504539 #>>44504543 #>>44505310 #>>44505350 #>>44505972 #>>44506053 #>>44506243 #>>44506719 #>>44506804 #>>44507985 #>>44508004 #>>44508124 #>>44508166 #>>44508187 #
1. troupo ◴[] No.44503206[source]
> Wrap all SQL responses with prompting that discourages the LLM from following instructions/commands injected within user data

I think this article of mine will be evergreen and relevant: https://dmitriid.com/prompting-llms-is-not-engineering

> Write E2E tests to confirm that even less capable LLMs don't fall for the attack [2]

> We noticed that this significantly lowered the chances of LLMs falling for attacks - even less capable models like Haiku 3.5.

So, you didn't even mitigate the attacks crafted by your own tests?

> e.g. model to detect prompt injection attempts

Adding one bullshit generator on top another doesn't mitigate bullshit generation

replies(1): >>44503279 #
2. otterley ◴[] No.44503279[source]
> Adding one bullshit generator on top another doesn't mitigate bullshit generation

It's bullshit all the way down. (With apologies to Bertrand Russell)