
237 points by jdkee | 6 comments
whoknowsidont ◴[] No.45948637[source]
MCP was a really shitty attempt at building a plugin framework, one vague enough to lure people in and then let other companies build plugin platforms to take care of the MCP nonsense.

"What is MCP, what does it bring to the table? Who knows. What does it do? The LLM stuff! Pay us $10 a month thanks!"

LLMs have function/tool calling built into them. No major model has any direct knowledge of MCP.

Not only do you not need MCP, but you should actively avoid using it.

Stick with tried-and-proven API standards that are actually observable and secure, and let your models/agents interact directly with those API endpoints.
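
To make that concrete: here's a rough sketch of native tool calling with the OpenAI Python SDK, no MCP anywhere. The tool name, schema, and endpoint are invented for illustration.

    import json
    import requests
    from openai import OpenAI

    client = OpenAI()

    # An ordinary function-calling schema -- this is what models are
    # actually trained on. (Tool and endpoint are made-up examples.)
    tools = [{
        "type": "function",
        "function": {
            "name": "get_issue",
            "description": "Fetch an issue from our REST API by id.",
            "parameters": {
                "type": "object",
                "properties": {"issue_id": {"type": "integer"}},
                "required": ["issue_id"],
            },
        },
    }]

    resp = client.chat.completions.create(
        model="gpt-4o",
        messages=[{"role": "user", "content": "Summarize issue 42."}],
        tools=tools,
    )

    # The model emits a structured tool call; your code then talks
    # straight to a plain, observable HTTP endpoint.
    call = resp.choices[0].message.tool_calls[0]
    args = json.loads(call.function.arguments)
    issue = requests.get(f"https://api.example.com/issues/{args['issue_id']}").json()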

replies(8): >>45948748 #>>45949815 #>>45950303 #>>45950716 #>>45950817 #>>45951274 #>>45951510 #>>45951951 #
paulddraper ◴[] No.45950303[source]
> No major models have any direct knowledge of MCP.

Claude and ChatGPT both support MCP, as does the OpenAI Agents SDK.

(If you mean the LLM itself, it is "known" at least as much as any other protocol. For whatever that means.)

replies(1): >>45950488 #
whoknowsidont ◴[] No.45950488[source]
>it is "known" at least as much as any other protocol.

No, it is not. Please understand what these LLMs are doing. Neither Claude nor ChatGPT nor any other major model knows what MCP is.

They know how to make function & tool calls. They have zero training data on MCP.

That is a factual statement, not an opinion.

replies(6): >>45950540 #>>45950541 #>>45950569 #>>45950763 #>>45950803 #>>45951338 #
1. choilive ◴[] No.45950540[source]
That is an easily falsified statement. If I ask ChatGPT or Claude what MCP is, Model Context Protocol comes up, and furthermore it can clearly explain what MCP does. That seems unlikely to be a coincidental hallucination.
replies(2): >>45950578 #>>45957524 #
2. whoknowsidont ◴[] No.45950578[source]
Training data =/= web search

Both ChatGPT and Claude will perform web searches when you ask them a question; the fact that you confused the two is ironically on-topic.

But you're still missing the principal point, because at some point these models will undoubtedly have access to that data and be trained on it.

But they didn't need to be, because function & tool calling is already trained into these models, and MCP does not augment that functionality in any way.
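
To see why, look at what an MCP client actually hands the model: it just repackages each MCP tool definition into the model's native function-calling schema. A sketch of that mapping (field names follow the MCP spec; the tool itself is made up):

    # An MCP tool definition, as returned by a server's tools/list.
    mcp_tool = {
        "name": "search_docs",
        "description": "Full-text search over project docs.",
        "inputSchema": {
            "type": "object",
            "properties": {"query": {"type": "string"}},
            "required": ["query"],
        },
    }

    def to_native_tool(t: dict) -> dict:
        """Repackage an MCP tool into an OpenAI-style function schema.
        Nothing new reaches the model itself -- it only ever sees its
        native format."""
        return {
            "type": "function",
            "function": {
                "name": t["name"],
                "description": t["description"],
                "parameters": t["inputSchema"],  # already JSON Schema
            },
        }

    print(to_native_tool(mcp_tool))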

replies(2): >>45950678 #>>46003517 #
3. davidcbc ◴[] No.45950678[source]
Claude gives me a lengthy explanation of MCP with web search disabled
replies(1): >>45950731 #
4. whoknowsidont ◴[] No.45950731{3}[source]
Great! It's still irrelevant.
5. cstrahan ◴[] No.45957524[source]
You're misinterpreting OP.

OP is saying that the models have not been trained on any particular MCP server's tools, which is why MCP servers serve up tool descriptions. Those descriptions are fed to the LLM just like any other text -- that is, they consume tokens and take up precious context.
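
A back-of-the-envelope sketch of that cost (the tool list is invented, and ~4 characters per token is only a crude heuristic):

    import json

    # Every connected MCP server's tool descriptions ride along in the
    # prompt on every request, used or not. (Invented example data.)
    tool_descriptions = [
        {"name": f"server{i}_tool{j}",
         "description": "A few sentences explaining when and how to call this tool...",
         "inputSchema": {"type": "object",
                         "properties": {"arg": {"type": "string"}}}}
        for i in range(5) for j in range(10)  # 5 servers x 10 tools each
    ]

    payload = json.dumps(tool_descriptions)
    print(f"{len(tool_descriptions)} tools, roughly {len(payload) // 4} tokens per request")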

Here's a representative example, taken from a real world need I had a week ago. I want to port a code base from one language to another (ReasonML to TypeScript, for various reasons). I figure the best way to go about this would be to topologically sort the files by their dependencies, so I can start with porting files with absolutely zero imports, then port files where the only dependencies are on files I've already ported, and so on. Let's suppose I want to use Claude Code to help with this, just to make the choice of agent concrete.

How should I go about this?

The overhead of the MCP approach would be analogous to trying to cram all of the relevant files into the context, and asking Claude to sort them. Even if the context window is sufficient, that doesn't matter because I don't want Claude to "try its best" to give me the topological sort straight from its nondeterministic LLM "head".

So what did I do?

I gave it enough information about how to consult build metadata files to derive the dependency graph, and then had it write a Python script. The LLM is already trained on a massive corpus of Python code, so there's no need to spoon-feed it "here's such and such standard library function" or "here's the basic Python syntax", etc. -- it already "knows" that. No MCP tool descriptions required.
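
For illustration, a sketch of the kind of script I mean, assuming the dependency graph has already been pulled out of the build metadata (that extraction step is project-specific, and the file names here are invented):

    from graphlib import TopologicalSorter

    # file -> set of files it depends on (invented example data;
    # in practice this is derived from the build metadata).
    deps = {
        "utils.re": set(),
        "types.re": set(),
        "parser.re": {"utils.re", "types.re"},
        "main.re": {"parser.re"},
    }

    # static_order() yields dependencies before their dependents --
    # exactly the porting order I want.
    for f in TopologicalSorter(deps).static_order():
        print(f)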

And then Claude Code spits out a script that, yes, I could have written myself, but it does it in maybe 1 minute total of my time. I can skim the script and make sure that it does exactly what it should be doing. Given that this is code, and not nondeterministic, wishy-washy LLM "reasoning", I know that the result is both deterministic and correct. The total token usage is tiny.

If you look at what Anthropic and Cloudflare have to say on the matter (see https://www.anthropic.com/engineering/code-execution-with-mc... and https://blog.cloudflare.com/code-mode/), it's basically what I've described, but without explicitly telling the LLM to write a script / reviewing that script.

If you have the LLM write code to interface with the world, it can leverage its training on that kind of code, the code itself will do what code does (precisely what it was configured to do), and the only tokens consumed will be those of the final result.
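
A sketch of that pattern (the endpoint is hypothetical): the model writes one script against an ordinary API, and only the script's final printed line re-enters the context, instead of every intermediate tool result echoing through it:

    import requests

    # One script replaces N MCP tool-call round trips.
    issues = requests.get("https://api.example.com/issues").json()
    open_bugs = [i for i in issues
                 if i["state"] == "open" and "bug" in i["labels"]]

    # Only this summary comes back to the model as tokens:
    print(f"{len(open_bugs)} open bugs")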

MCP is incredibly wasteful and provides more opportunities for LLMs to make mistakes and/or get confused.

6. judahmeek ◴[] No.46003517[source]
> But they didn't need to be, because function & tool calling is already trained into these models, and MCP does not augment that functionality in any way.

I think you're making a weird semantic argument. How is MCP use not a tool call?