←back to thread

134 points nick_wolf | 2 comments | | HN request time: 0.549s | source

I noticed the growing security concerns around MCP (https://news.ycombinator.com/item?id=43600192) and built an open source tool that can detect several patterns of tool poisoning attacks, exfiltration channels and cross-origin manipulations.

MCP-Shield scans your installed servers (Cursor, Claude Desktop, etc.) and shows what each tool is trying to do at the instruction level, beyond just the API surface. It catches hidden instructions that try to read sensitive files, shadow other tools' behavior, or exfiltrate data.

Example of what it detects:

- Hidden instructions attempting to access ~/.ssh/id_rsa

- Cross-origin manipulations between server that can redirect WhatsApp messages

- Tool shadowing that overrides behavior of other MCP tools

- Potential exfiltration channels through optional parameters

I've included clear examples of detection outputs in the README and multiple example vulnerabilities in the repo so you can see the kinds of things it catches.

This is an early version, but I'd appreciate feedback from the community, especially around detection patterns and false positives.

1. bosky101 ◴[] No.43690120[source]
I'd like to remind you that tools is a json array to any modern llm inference api. That rather than returning text, tells you which function to call.

I'm all for abstraction of a level of indirection. But this is pushing things too far.

We now have an entire ecosystem, layers of unneeded engineering, cohorts of talent and capital going to create man in the middle servers that forces us to get this array from around the world + maintain a server with several gb of deps to get a json array that you should't trust.

2) It makes sense if every server has a tools.txt equivalent of their own swagger. Eg i would trust google photos to maintain and document their tools rather than the 10,000 MCP servers possibly alive for no reason and already out of date by the time you are done reading this comment. In addition to being over engineered, to trust a random server as a proxy never made any sense.

3) nobody wants to run servers. Can't find this meme, but found it here on HN several times.

Sorry but I would rather not wait a year for this industry to crash and burn and take down genai apps galore or worse, start leaking this data and your bills.

Kudos to document any security gaps though.

replies(1): >>43690827 #
2. ◴[] No.43690827[source]