
600 points antirez | 4 comments
dakiol No.44625484
> Gemini 2.5 PRO | Claude Opus 4

Whether it's vibe coding, agentic coding, or copy-pasting from the web interface into your editor, it's still sad to see the normalization of private (i.e., paid) LLM models. I like the progress that LLMs bring and I see them as a powerful tool, but I cannot understand how programmers (whether complete nobodies or popular figures) don't mind adding a strong dependency on a third party in order to keep programming. Programming used to be (and still is, to a large extent) an activity that can be done with open and free tools. I am afraid that in a few years that will no longer be possible (as in: most programmers will be so tied to a paid LLM that not using one would be like not using an IDE or vim today), since everyone is using private LLMs. The excuse "but you earn six figures, what's $200/month to you?" doesn't really capture the issue here.

simonw No.44626556
The models I can run locally aren't as good yet, and are way more expensive to operate.

Once it becomes economical to run a Claude 4 class model locally you'll see a lot more people doing that.

The closest you can get right now might be Kimi K2 on a pair of 512GB Mac Studios, at a cost of about $20,000.
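As a rough sanity check on that hardware pairing, here is a sketch of the memory footprint of a model at a given quantization. The figures are assumptions for illustration (Kimi K2's published ~1T total parameters, a loose 10% allowance for runtime/KV-cache overhead), not measurements:

```python
# Approximate memory needed to hold a model's weights at a given quantization.
# params_b: total parameters in billions; overhead is a loose assumption for
# runtime structures and KV cache, not a measured figure.

def model_memory_gb(params_b: float, bits_per_weight: float,
                    overhead: float = 1.1) -> float:
    """Approximate GB of memory to hold the weights plus runtime overhead."""
    return params_b * (bits_per_weight / 8) * overhead

# A ~1T-parameter model (Kimi K2's published size):
print(model_memory_gb(1000, 8))  # 8-bit: ~1100 GB, over two 512GB machines' total
print(model_memory_gb(1000, 4))  # 4-bit: ~550 GB, fits with room for context
```

Which is why a pair of 512GB machines (1TB combined) is roughly the entry point: full-precision weights don't fit, but a quantized copy does.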

1. jmb99 No.44631352
What’s your budget and speed requirement? A quad-CPU Xeon E7 v4 server (a Supermicro X10QBI, for example) with 1TB of RAM gives you ~340GB/s of memory bandwidth and enough actual memory to host a full DeepSeek instance, but it will be relatively slow (a few tokens/s at most, in my experience). Up-front cost is a bit under $1k, less if you can source cheap 32GB DDR3 RAM. Power consumption is relatively high, ~1kW under load. But I don’t think you can self-host a large model for cheaper than that.

(If you need even more memory you could equip one of those servers with 6TB of DDR3 but you’ll lose a bit of bandwidth if you go over 2TB. DDR4 is also a slightly faster option but you’re spending 4x as much for the same capacity.)
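The "few tokens/s" figure follows from a bandwidth-bound back-of-envelope: during decode, every active weight must stream through memory once per token. A sketch, assuming DeepSeek-style MoE numbers (~37B active parameters per token, 8 bits per weight); both figures are assumptions for illustration, not measurements from that box:

```python
# Bandwidth-bound upper limit on decode speed: tokens/s can't exceed
# (memory bandwidth) / (bytes read per token), ignoring compute entirely.

def tokens_per_second(bandwidth_gb_s: float,
                      active_params_b: float,
                      bytes_per_weight: float) -> float:
    """Ceiling on decode tokens/s when memory bandwidth is the bottleneck."""
    gb_per_token = active_params_b * bytes_per_weight  # GB streamed per token
    return bandwidth_gb_s / gb_per_token

# Quad Xeon E7 v4 box from the comment: ~340 GB/s aggregate bandwidth,
# ~37B active params (DeepSeek-style MoE) at 8-bit:
print(tokens_per_second(340, 37, 1.0))  # ~9 tok/s theoretical ceiling
```

Real-world throughput lands well below the ceiling (NUMA penalties, attention, KV-cache traffic), hence "a few tokens/s".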

2. pmarreck No.44631449
This would require massively more power than the Mac Studios.
3. jmb99 No.44637427
Yep, ~1kW as mentioned. Depending on your electrical rate, break even might be years down the line. And obviously the Mac Studios would perform substantially better.

Edit: And also, to get even half as much memory, you need to spend $10k. If you want to host actually-large LLMs (not quantized/distilled versions), you'll need to spend close to that much. Maybe you can get away with 256GB for now, but that won't even host full DeepSeek (and I'm not sure 512GB will either, once you account for OS overhead and a large context window).
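The break-even point can be sketched directly. Every input below (electricity price, duty cycle, the Mac Studios' power draw) is an illustrative assumption, not a figure from the thread:

```python
# Back-of-envelope break-even between a cheap, power-hungry used server and a
# pricier, efficient machine: years until the extra power bill eats the price gap.

def break_even_years(cheap_cost: float, cheap_watts: float,
                     pricey_cost: float, pricey_watts: float,
                     usd_per_kwh: float, hours_per_day: float = 8.0) -> float:
    """Years of use until the cheap box's extra electricity cost
    equals the up-front price difference."""
    extra_kw = (cheap_watts - pricey_watts) / 1000.0
    extra_usd_per_year = extra_kw * hours_per_day * 365 * usd_per_kwh
    return (pricey_cost - cheap_cost) / extra_usd_per_year

# Assumed: $1k server at 1000W vs $10k of Mac Studios at ~200W,
# $0.15/kWh, 8 hours of load per day:
print(round(break_even_years(1_000, 1000, 10_000, 200, 0.15), 1))  # ~25.7 years
```

Under these assumptions the power bill never catches the price gap on any realistic hardware lifetime, which is why the used server can still win on pure cost despite drawing ~1kW.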

4. theshrike79 No.44644474
I think we're at the early-2010s Bitcoin stage here.

People were buying stores empty of GPUs to mine BTC.

Then people built custom ASICs that couldn't do anything but mine BTC, but did it far more cheaply and with far less electricity, and now pretty much nobody GPU-mines anymore.

I'm waiting for a similar thing to happen to local AI.