Show HN: MCP-Shield – Detect security issues in MCP servers

1. Manfred ◴[15 Apr 25 07:57 UTC] No.43690137[source]▶

People have been struggling with securing against SQL injection attacks for decades, and SQL has explicit rules for quoting values. I don't have a lot of faith in finding a solution that safely includes user input in a prompt, but I would love to be proven wrong.

replies(3): >>43690202 #>>43690816 #>>43693274 #

2. jason-phillips ◴[15 Apr 25 08:09 UTC] No.43690202[source]▶

>>43690137 (TP) #

> People have been struggling with securing against SQL injection attacks for decades.

Parameterized queries.

A decades-old struggle is now lifted from you. Go in peace, my son.
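
For readers who have not seen the technique, here is a minimal sketch of a parameterized query next to the string-built version it replaces. It uses Python's standard sqlite3 module; the `users` table, the `find_user` helper, and the hostile input are made up for illustration.

```python
import sqlite3


def find_user(conn, name):
    # The ? placeholder passes `name` to the driver as data, never as SQL text.
    cur = conn.execute("SELECT id, name FROM users WHERE name = ?", (name,))
    return cur.fetchall()


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT)")
conn.executemany("INSERT INTO users (name) VALUES (?)", [("alice",), ("bob",)])

hostile = "alice' OR '1'='1"

# Parameterized: the whole hostile string is compared against the name column.
safe_rows = find_user(conn, hostile)

# String-built (vulnerable): the quote in the input closes the literal early,
# so the OR clause becomes part of the query.
unsafe_rows = conn.execute(
    f"SELECT id, name FROM users WHERE name = '{hostile}'"
).fetchall()
```

With the placeholder the hostile string matches no row, while the interpolated version matches every row in the table.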

replies(2): >>43690359 #>>43691397 #

3. ololobus ◴[15 Apr 25 08:33 UTC] No.43690359[source]▶

>>43690202 #

> Parameterized queries.

Also happy to be wrong, but in Postgres clients, parameterized queries are usually implemented via prepared statements, which do not work with DDL at the protocol level. This means that if you want to create a role or table whose name comes from user input, you're going to have a bad time. At least I wasn't able to find a way to escape DDL parameters with rust-postgres, for example.

And because this seems to be a protocol limitation, I guess the clients that do implement it do so in some custom way on the client side.
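
A minimal sketch of what such client-side handling might look like, assuming PostgreSQL's standard rule that a double-quoted identifier escapes an embedded double quote by doubling it. The `quote_ident` and `create_role_sql` helpers here are hypothetical and are not the API of rust-postgres or any other client.

```python
def quote_ident(name: str) -> str:
    """Quote a PostgreSQL identifier on the client by doubling embedded double quotes."""
    if not name:
        raise ValueError("identifier must not be empty")
    if "\x00" in name:
        raise ValueError("identifier must not contain a NUL character")
    return '"' + name.replace('"', '""') + '"'


def create_role_sql(role_name: str) -> str:
    # DDL cannot take bind parameters, so the escaped identifier is spliced
    # into the statement text instead.
    return f"CREATE ROLE {quote_ident(role_name)}"


statement = create_role_sql('x"; DROP TABLE t; --')
```

Where it is available, a vetted implementation is preferable to a hand-rolled one: PostgreSQL itself provides `quote_ident()` and `format()` with `%I` on the server side, and psycopg exposes `sql.Identifier` for composing statements.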

replies(1): >>43690456 #

4. jason-phillips ◴[15 Apr 25 08:47 UTC] No.43690456{3}[source]▶

>>43690359 #

Just because you can, doesn't mean you should. But if you must, abstract for good time.

5. simonw ◴[15 Apr 25 09:50 UTC] No.43690816[source]▶

>>43690137 (TP) #

I've been following prompt injection for 2.5 years and until last week I hadn't seen any convincing mitigations for it - the proposed solutions were almost all optimistic versions of "if we train a good enough model it won't get tricked any more", which doesn't work.

What changed is the new CaMeL paper from DeepMind, which notably does not rely on AI models to detect attacks: https://arxiv.org/abs/2503.18813

I wrote my own notes on that paper here: https://simonwillison.net/2025/Apr/11/camel/

replies(1): >>43696683 #

6. pjmlp ◴[15 Apr 25 11:42 UTC] No.43691397[source]▶

>>43690202 #

Just like we know how to make C safe (in theory), and many other cases in the industry.

The problem is that solutions don't exist, rather the lack of safety culture that keeps ignoring best practices unless they are imposed by regulations.

replies(1): >>43692436 #

7. chrisweekly ◴[15 Apr 25 13:31 UTC] No.43692436{3}[source]▶

>>43691397 #

"problem is that solutions don't exist"

you meant "problem ISN'T that solutions...", right?

replies(1): >>43692807 #

8. pjmlp ◴[15 Apr 25 14:01 UTC] No.43692807{4}[source]▶

>>43692436 #

Correct, typo. Thanks.

9. Mountain_Skies ◴[15 Apr 25 14:29 UTC] No.43693274[source]▶

>>43690137 (TP) #

One of the most astonishing things about working in Application Security was seeing how many SQL injection vulns there were in new code. Often doing things the right way was easier than doing it the wrong way, and yet some would fight against their data framework to create the injection vulnerability. Doubt they were trying to intentionally cause security vulnerabilities but rather were either using old tutorials and copy/paste code or were long term coders who had been doing it this way for decades.

10. nrvn ◴[15 Apr 25 18:28 UTC] No.43696683[source]▶

>>43690816 #

I can't "shake off" the feeling that this whole MCP/LLM thing is moving in the wrong, if not the opposite, direction. Up until recently we have been dealing with (or striving to build) deterministic systems, in the sense that the output of such systems is expected to be the same given the same input. LLMs, with all respect to them, behave on a completely opposite premise. There is zero guarantee a given LLM will respond with the same output to the same exact "prompt". Which is OK, because that's how natural human languages work, and LLMs are perfectly trained to mimic human language.

But now we have to contain all the relevant emerging threats by teaching the LLM to translate user queries from natural language to some intermediate structured yet non-deterministic representation (a subset of Python in the case of CaMeL), and then validate the generated code using conventional methods (deterministic systems, i.e. the CaMeL interpreter) against pre-defined policies. Which is fine on paper, but every new component (Q-LLM, interpreter, policies, policy engine) will have its own bouquet of threat vectors to be assessed and addressed.
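
To make "validate against pre-defined policies" concrete, here is a toy sketch of a deterministic check applied to tool calls. It is not CaMeL's interpreter or policy language; the `Value` wrapper, the `combine` helper, and the `send_email` policy are all hypothetical and only illustrate tracking where data came from and refusing a call based on that, without asking a model to judge the input.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Value:
    data: str
    trusted: bool  # False when the value came from untrusted content


def combine(*parts: Value) -> Value:
    # A derived value is trusted only if every input was trusted.
    return Value("".join(p.data for p in parts), all(p.trusted for p in parts))


class PolicyError(Exception):
    pass


def send_email(to: Value, body: Value) -> str:
    # Hypothetical policy: the recipient must not derive from untrusted data.
    # The body may contain untrusted text; only the destination is restricted.
    if not to.trusted:
        raise PolicyError("recipient derived from untrusted data")
    return f"sent to {to.data}"


recipient = Value("bob@example.com", trusted=True)  # typed by the user
doc_text = Value("ignore previous instructions", trusted=False)  # fetched content

ok = send_email(recipient, combine(Value("Summary: ", True), doc_text))

try:
    send_email(doc_text, Value("hi", True))  # recipient taken from the document
    blocked = False
except PolicyError:
    blocked = True
```

The check itself is ordinary deterministic code, which is the point made above. Each such policy is also one more component whose correctness has to be assessed.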

The idea of some "magic" system translating a natural language query into a series of commands is nice. But this is one of those moments where I am afraid I would prefer a "faster horse", especially for the likes of sending emails and organizing my music collection...