
MCP in LM Studio

(lmstudio.ai)
226 points by yags | 18 comments
1. minimaxir ◴[] No.44380112[source]
LM Studio has quickly become the best way to run local LLMs on an Apple Silicon Mac: no offense to vllm/ollama and other terminal-based approaches, but LLMs have many levers for tweaking output and sometimes you need a UI to manage it. Now that LM Studio supports MLX models, it's one of the most efficient too.

I'm not bullish on MCP, but at the least this approach gives a good way to experiment with it for free.

replies(4): >>44380220 #>>44380533 #>>44380699 #>>44381188 #
2. nix0n ◴[] No.44380220[source]
LM Studio is also quite good on Windows with an Nvidia RTX GPU.
replies(1): >>44383574 #
3. pzo ◴[] No.44380533[source]
I just wish they'd give the UI a facelift. Right now it's too colorful for me, with many different shades of similar colors. I wish they'd copy a color palette from Google AI Studio, or from Trae or PyCharm.
4. chisleu ◴[] No.44380699[source]
> I'm not bullish on MCP

You gotta help me out. What do you see holding it back?

replies(1): >>44381024 #
5. minimaxir ◴[] No.44381024[source]
tl;dr: the current hype around it is a solution looking for a problem, and at a high level it's just a rebrand of the Tools paradigm.
replies(1): >>44381099 #
6. mhast ◴[] No.44381099{3}[source]
It's "Tools as a service", so it's really trying to make tool calling easier to use.
replies(1): >>44382200 #
7. zackify ◴[] No.44381188[source]
Ollama doesn't even have a way to customize the context size per model and persist it. LM Studio does :)
replies(1): >>44382206 #
8. ijk ◴[] No.44382200{4}[source]
Near as I can tell it's supposed to make calling other people's tools easier. But I don't want to spin up an entire server to invoke a calculator. So far it seems to make building my own local tools harder, unless there's some guidebook I'm missing.
replies(2): >>44382667 #>>44384088 #
9. Anaphylaxis ◴[] No.44382206[source]
This isn't true. You can `ollama run {model}`, then `/set parameter num_ctx {ctx}` and `/save`. It's recommended to `/save {model}:{ctx}` so the setting persists across model updates.
replies(2): >>44385978 #>>44386362 #
10. xyc ◴[] No.44382667{5}[source]
It's a protocol that doesn't dictate how you call the tool. You can use an in-memory transport without needing to spin up a server. Your tool can just be a function, but with the flexibility of serving it to other clients.
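For what it's worth, here's a rough sketch of that pattern using the official Python MCP SDK (`pip install mcp`). The in-memory helper and the `_mcp_server` attribute are borrowed from the SDK's test utilities, so treat the exact import path and signature as assumptions rather than documented API:

```python
import asyncio

from mcp.server.fastmcp import FastMCP
from mcp.shared.memory import create_connected_server_and_client_session

mcp = FastMCP("calculator")

@mcp.tool()
def add(a: float, b: float) -> float:
    """Add two numbers."""
    return a + b

async def main() -> None:
    # Wire a client session directly to the server object in the same process:
    # no subprocess, no HTTP endpoint.
    async with create_connected_server_and_client_session(mcp._mcp_server) as session:
        result = await session.call_tool("add", {"a": 2, "b": 3})
        print(result.content[0].text)

asyncio.run(main())
```

The same `FastMCP` object could later be served over stdio or HTTP to other clients without touching the tool definition.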
replies(1): >>44385236 #
11. boredemployee ◴[] No.44383574[source]
Care to elaborate? I have an RTX 4070 with 12 GB VRAM + 64 GB RAM; I wonder what models I can run with it. Anything useful?
replies(1): >>44388014 #
12. cchance ◴[] No.44384088{5}[source]
You're not spinning up a whole server, lol. Most MCPs can be run locally and talked to over stdio; they're just apps that the LLM can call, and what they talk to or do is up to the MCP writer. It's easier to have an MCP that communicates what it can do and handles the back and forth than to write non-standard middleware to handle, say, calls to an API, or AppleScript, or VMware, or something else...
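As a minimal sketch of that "just an app over stdio" shape (again assuming the official Python MCP SDK; the tool itself is a made-up example):

```python
from pathlib import Path

from mcp.server.fastmcp import FastMCP

mcp = FastMCP("local-files")

@mcp.tool()
def list_dir(path: str = ".") -> list[str]:
    """List the entries in a local directory."""
    return [p.name for p in Path(path).iterdir()]

if __name__ == "__main__":
    mcp.run()  # defaults to the stdio transport
```

The client config just points at the command that launches this script as a subprocess; there's no port to open or service to keep running.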
replies(1): >>44385222 #
13. ijk ◴[] No.44385222{6}[source]
I wish the documentation was clearer on that point; I went looking through their site and didn't see any examples that weren't oversimplified REST API calls. I imagine they might have updated it since then, or I missed something.
14. ijk ◴[] No.44385236{6}[source]
Are there any examples of that? All the documentation I saw seemed to be about building an MCP server, with very little about connecting an existing inference infrastructure to local functions.
15. truemotive ◴[] No.44385978{3}[source]
This can be done with custom Modelfiles as well; I was pretty bent when I found out that 2048 was the default context length.

https://ollama.readthedocs.io/en/modelfile/
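For example, a Modelfile along these lines (model name and context value are placeholders), built with `ollama create llama3.1-8k -f Modelfile`:

```
# Derive a model with a larger default context window
FROM llama3.1
PARAMETER num_ctx 8192
```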

16. zackify ◴[] No.44386362{3}[source]
As of two weeks back, if I did this it would reset the moment Cline made an API call, but LM Studio would work correctly. I'll have to try again. I even confirmed Cline was not overriding num_ctx.
17. nix0n ◴[] No.44388014{3}[source]
LM Studio's model search is pretty good at showing what models will fit in your VRAM.

For my 16 GB of VRAM, those models don't include anything that's good at coding, even when I provide the API documents via PDF upload (another thing that LM Studio makes easy).

So, not really, but LM Studio at least makes it easier to find that out.

replies(1): >>44389969 #
18. boredemployee ◴[] No.44389969{4}[source]
ok, ty for the reply!