
326 points threeturn | 2 comments

Dear Hackers, I’m interested in your real-world workflows for using open-source LLMs and open-source coding assistants on your laptop (not just cloud/enterprise SaaS). Specifically:

Which model(s) are you running, and with which runtime (e.g., Ollama, LM Studio, or others)? Which open-source coding assistant or integration (for example, a VS Code plugin) are you using?

What laptop hardware do you have (CPU, GPU/NPU, memory, discrete or integrated GPU, OS), and how does it perform for your workflow?

What kinds of tasks do you use it for (code completion, refactoring, debugging, code review), and how reliable is it (what works well, where it falls short)?

I'm conducting my own investigation, which I will be happy to share once it's finished.

Thanks! Andrea.

kabes ◴[] No.45776731[source]
Let's say I have a server with an H200 GPU at home. What's the best open model for coding I can run on it today? And is it somewhat competitive with commercial models like Sonnet 4.5?
replies(3): >>45776946 #>>45777002 #>>45777030 #
suprjami ◴[] No.45777030[source]
If you have ~$25k to buy an H200, then don't buy one. Rent one for much less, and keep renting newer models when the H200 becomes an outdated paperweight.

Assuming you ran inference for the full working day, you'd need to run your H200 for almost 2 years to break even. Realistically you don't run inference full time so you'll never realise the value of the card before it's obsolete.
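The break-even estimate above can be sketched as a short calculation. The ~$25k purchase price comes from the comment; the hourly rental rates, the 8-hour working day, and the days-per-year figure are assumptions for illustration only, since the comment does not state them. Power, hosting, and resale value are ignored.

```python
def break_even_years(purchase_usd: float,
                     rental_usd_per_hour: float,
                     hours_per_day: float = 8.0,
                     days_per_year: float = 365.0) -> float:
    """Years of use after which buying costs the same as renting.

    All defaults are assumptions, not figures from the thread.
    """
    if rental_usd_per_hour <= 0 or hours_per_day <= 0 or days_per_year <= 0:
        raise ValueError("rate, hours_per_day and days_per_year must be positive")
    # Total rented hours that the purchase price would pay for.
    break_even_hours = purchase_usd / rental_usd_per_hour
    # Convert those hours into years at the assumed usage level.
    return break_even_hours / (hours_per_day * days_per_year)


if __name__ == "__main__":
    purchase = 25_000.0  # "~$25k" from the comment
    # Hypothetical rental rates in USD per GPU-hour; check current pricing.
    for rate in (3.0, 4.5, 6.0):
        years = break_even_years(purchase, rate)
        print(f"${rate:.2f}/h -> break even after {years:.2f} years")
```

Lowering `hours_per_day` to reflect part-time use pushes the break-even point further out, which is the point the comment makes about never realising the card's value.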

replies(1): >>45779674 #
1. kabes ◴[] No.45779674[source]
The company I work for is in the defense industry and by contract can't send any code outside its own datacenter. So cloud-rented H200s are a no-go, and obviously commercial LLMs are as well. Breaking even is not the goal here.
replies(1): >>45785594 #
2. suprjami ◴[] No.45785594[source]
In that case I suggest you buy cheaper desktop cards instead of an H200. Two or three 5090s will let you run decent models at very good speed.
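As a rough way to check which models fit on two or three cards, here is a minimal sizing sketch. It assumes 32 GB of VRAM per 5090 (the card's spec, not a figure from the thread), counts only weight memory at a given quantization, and adds a hypothetical 20% overhead for KV cache and activations. Real usage depends on context length and the inference runtime, so treat the result as a first filter.

```python
def weights_gb(params_billion: float, bits_per_weight: float) -> float:
    """Approximate weight memory in GB (decimal): parameters * bits / 8."""
    return params_billion * bits_per_weight / 8.0


def fits(params_billion: float,
         bits_per_weight: float,
         num_gpus: int,
         vram_gb_per_gpu: float = 32.0,     # assumed: RTX 5090 spec
         overhead_fraction: float = 0.2) -> bool:  # assumed: KV cache etc.
    """True if weights plus the assumed overhead fit in the pooled VRAM."""
    needed = weights_gb(params_billion, bits_per_weight) * (1.0 + overhead_fraction)
    return needed <= num_gpus * vram_gb_per_gpu


if __name__ == "__main__":
    # Hypothetical model sizes (billions of parameters) at 4-bit quantization.
    for size in (32, 70, 120):
        for gpus in (1, 2, 3):
            verdict = "fits" if fits(size, 4, gpus) else "does not fit"
            print(f"{size}B @ 4-bit on {gpus} GPU(s): {verdict}")
```

The sketch treats VRAM across cards as one pool, which only holds if the runtime can split a model across GPUs.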