
MCP in LM Studio

(lmstudio.ai)
226 points by yags | 18 comments
1. minimaxir ◴[] No.44380112[source]
LM Studio has quickly become the best way to run local LLMs on an Apple Silicon Mac: no offense to vllm/ollama and other terminal-based approaches, but LLMs have many levers for tweaking output and sometimes you need a UI to manage it. Now that LM Studio supports MLX models, it's one of the most efficient too.

I'm not bullish on MCP, but at the least this approach gives a good way to experiment with it for free.

replies(4): >>44380220 #>>44380533 #>>44380699 #>>44381188 #
2. nix0n ◴[] No.44380220[source]
LM Studio is also quite good on Windows with an Nvidia RTX GPU.
replies(1): >>44383574 #
3. pzo ◴[] No.44380533[source]
I just wish they'd give the UI a facelift. Right now it's too colorful for me, with many different shades of similar colors. I wish they'd copy a color palette from Google AI Studio, or from Trae or PyCharm.
4. chisleu ◴[] No.44380699[source]
> I'm not bullish on MCP

You gotta help me out. What do you see holding it back?

replies(1): >>44381024 #
5. minimaxir ◴[] No.44381024[source]
tl;dr: the current hype around it is a solution looking for a problem, and at a high level it's just a rebrand of the Tools paradigm.
replies(1): >>44381099 #
6. mhast ◴[] No.44381099{3}[source]
It's "Tools as a service", so it's really trying to make tool calling easier to use.
replies(1): >>44382200 #
7. zackify ◴[] No.44381188[source]
Ollama doesn't even have a way to customize the context size per model and persist it. LM Studio does :)
replies(1): >>44382206 #
8. ijk ◴[] No.44382200{4}[source]
Near as I can tell it's supposed to make calling other people's tools easier. But I don't want to spin up an entire server to invoke a calculator. So far it seems to make building my own local tools harder, unless there's some guidebook I'm missing.
replies(2): >>44382667 #>>44384088 #
9. Anaphylaxis ◴[] No.44382206[source]
This isn't true. You can `ollama run {model}`, then `/set parameter num_ctx {ctx}` and `/save`. It's recommended to `/save {model}:{ctx}` so the setting persists across model updates.
replies(2): >>44385978 #>>44386362 #
10. xyc ◴[] No.44382667{5}[source]
It's a protocol that doesn't dictate how you call the tool. You can use an in-memory transport without needing to spin up a server. Your tool can just be a function, but with the flexibility of serving it to other clients.
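For what it's worth, here's a rough sketch of that pattern using the official Python MCP SDK (`pip install mcp`). The in-memory helper and the `_mcp_server` attribute are borrowed from the SDK's test utilities, so treat the exact import path and signature as assumptions rather than documented API:

```python
import asyncio

from mcp.server.fastmcp import FastMCP
from mcp.shared.memory import create_connected_server_and_client_session

mcp = FastMCP("calculator")

@mcp.tool()
def add(a: float, b: float) -> float:
    """Add two numbers."""
    return a + b

async def main() -> None:
    # Wire a client session directly to the server object in the same process:
    # no subprocess, no HTTP endpoint.
    async with create_connected_server_and_client_session(mcp._mcp_server) as session:
        result = await session.call_tool("add", {"a": 2, "b": 3})
        print(result.content[0].text)

asyncio.run(main())
```

The same `FastMCP` object could later be served over stdio or HTTP to other clients without touching the tool definition.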
replies(1): >>44385236 #
11. boredemployee ◴[] No.44383574[source]
Care to elaborate? I have an RTX 4070 with 12 GB VRAM + 64 GB RAM; I wonder what models I can run with it. Anything useful?
replies(1): >>44388014 #
12. cchance ◴[] No.44384088{5}[source]
You're not spinning up a whole server, lol. Most MCPs can be run locally and talked to over stdio; they're just apps that the LLM can call, and what they talk to or do is up to the MCP writer. It's easier to have an MCP that communicates what it can do and handles the back and forth than to write non-standard middleware to handle, say, calls to an API, or AppleScript, or VMware, or something else...
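As a minimal sketch of that "just an app over stdio" shape (again assuming the official Python MCP SDK; the tool itself is a made-up example):

```python
from pathlib import Path

from mcp.server.fastmcp import FastMCP

mcp = FastMCP("local-files")

@mcp.tool()
def list_dir(path: str = ".") -> list[str]:
    """List the entries in a local directory."""
    return [p.name for p in Path(path).iterdir()]

if __name__ == "__main__":
    mcp.run()  # defaults to the stdio transport
```

The client config just points at the command that launches this script as a subprocess; there's no port to open or service to keep running.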
replies(1): >>44385222 #
13. ijk ◴[] No.44385222{6}[source]
I wish the documentation was clearer on that point; I went looking through their site and didn't see any examples that weren't oversimplified REST API calls. I imagine they might have updated it since then, or I missed something.
14. ijk ◴[] No.44385236{6}[source]
Are there any examples of that? All the documentation I saw seemed to be about building an MCP server, with very little about connecting an existing inference infrastructure to local functions.
15. truemotive ◴[] No.44385978{3}[source]
This can be done with custom Modelfiles as well; I was pretty bent when I found out that 2048 was the default context length.

https://ollama.readthedocs.io/en/modelfile/
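For example, a Modelfile along these lines (model name and context value are placeholders), built with `ollama create llama3.1-8k -f Modelfile`:

```
# Derive a model with a larger default context window
FROM llama3.1
PARAMETER num_ctx 8192
```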

16. zackify ◴[] No.44386362{3}[source]
As of two weeks back, if I did this it would reset the moment Cline made an API call, but LM Studio would work correctly. I'll have to try again. I even confirmed Cline was not overriding num_ctx.
17. nix0n ◴[] No.44388014{3}[source]
LM Studio's model search is pretty good at showing what models will fit in your VRAM.

For my 16 GB of VRAM, those models don't include anything that's good at coding, even when I provide the API documents via PDF upload (another thing that LM Studio makes easy).

So, not really, but LM Studio at least makes it easier to find that out.

replies(1): >>44389969 #
18. boredemployee ◴[] No.44389969{4}[source]
ok, ty for the reply!