←back to thread

134 points nick_wolf | 3 comments | | HN request time: 0.643s | source

I noticed the growing security concerns around MCP (https://news.ycombinator.com/item?id=43600192) and built an open source tool that can detect several patterns of tool poisoning attacks, exfiltration channels and cross-origin manipulations.

MCP-Shield scans your installed servers (Cursor, Claude Desktop, etc.) and shows what each tool is trying to do at the instruction level, beyond just the API surface. It catches hidden instructions that try to read sensitive files, shadow other tools' behavior, or exfiltrate data.

Example of what it detects:

- Hidden instructions attempting to access ~/.ssh/id_rsa

- Cross-origin manipulations between server that can redirect WhatsApp messages

- Tool shadowing that overrides behavior of other MCP tools

- Potential exfiltration channels through optional parameters

I've included clear examples of detection outputs in the README and multiple example vulnerabilities in the repo so you can see the kinds of things it catches.

This is an early version, but I'd appreciate feedback from the community, especially around detection patterns and false positives.

1. paulgb ◴[] No.43689354[source]
Neat, but what’s to stop a server from reporting one innocuous set of tools to MCP-Shield and then a different set of tools to the client?
replies(1): >>43689507 #
2. nick_wolf ◴[] No.43689507[source]
Great point, thanks for raising it. You're spot on – the client currently sends name: 'mcp-shield', enabling exactly the bait-and-switch scenario you described.

I'll push an update in ~30 mins adding an optional --identify-as <client-name> flag. This will let folks test for that kind of evasion by mimicking specific clients, while keeping the default behavior consistent. Probably will think more about other possible vectors. Really appreciate the feedback!

replies(1): >>43689570 #
3. nick_wolf ◴[] No.43689570[source]
That was faster than expected - here's the merged commit implementing the --identify-as flag: https://github.com/riseandignite/mcp-shield/commit/e7e2a6c04.... Thanks again!