
600 points antirez | 1 comment | source
dakiol ◴[] No.44625484[source]
> Gemini 2.5 PRO | Claude Opus 4

Whether it's vibe coding, agentic coding, or copy-pasting from the web interface into your editor, it's still sad to see the normalization of private (i.e., paid) LLM models. I like the progress that LLMs bring and I see them as a powerful tool, but I cannot understand how programmers (whether complete nobodies or popular figures) don't mind adding a strong dependency on a third party in order to keep programming. Programming used to be (and still is, to a large extent) an activity that can be done with open and free tools. I am afraid that in a few years that will no longer be possible (as in: most programmers will be so tied to a paid LLM that not using one would be like not using an IDE or vim today), since everyone is using private LLMs. The excuse "but you earn six figures, what's $200/month to you?" doesn't really capture the issue here.

replies(46): >>44625521 #>>44625545 #>>44625564 #>>44625827 #>>44625858 #>>44625864 #>>44625902 #>>44625949 #>>44626014 #>>44626067 #>>44626198 #>>44626312 #>>44626378 #>>44626479 #>>44626511 #>>44626543 #>>44626556 #>>44626981 #>>44627197 #>>44627415 #>>44627574 #>>44627684 #>>44627879 #>>44628044 #>>44628982 #>>44629019 #>>44629132 #>>44629916 #>>44630173 #>>44630178 #>>44630270 #>>44630351 #>>44630576 #>>44630808 #>>44630939 #>>44631290 #>>44632110 #>>44632489 #>>44632790 #>>44632809 #>>44633267 #>>44633559 #>>44633756 #>>44634841 #>>44635028 #>>44636374 #
simonw ◴[] No.44626556[source]
The models I can run locally aren't as good yet, and are way more expensive to operate.

Once it becomes economical to run a Claude 4 class model locally you'll see a lot more people doing that.

The closest you can get right now might be Kimi K2 on a pair of 512GB Mac Studios, at a cost of about $20,000.
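For rough context on why it takes a pair of them, here is a back-of-the-envelope sketch. The parameter count, quantization, and overhead figures below are assumptions for illustration (Kimi K2 is reported as a ~1-trillion-parameter MoE model), not numbers from this thread:

```python
# Rough memory estimate for hosting a ~1T-parameter model locally.
# All three inputs are assumptions: ~1T total parameters (reported for
# Kimi K2), 8-bit quantization (1 byte/param), ~10% overhead for KV
# cache and activations.
params = 1_000_000_000_000   # ~1 trillion parameters (assumed)
bytes_per_param = 1          # 8-bit quantization (assumed)
overhead = 0.10              # KV cache / activation fudge factor (assumed)

weights_gb = params * bytes_per_param / 1024**3
total_gb = weights_gb * (1 + overhead)
print(f"weights: ~{weights_gb:.0f} GB, with overhead: ~{total_gb:.0f} GB")
# weights: ~931 GB, with overhead: ~1024 GB
```

Under those assumptions the weights alone overflow a single 512GB machine, which is why it takes two Mac Studios networked together.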

replies(12): >>44627184 #>>44627617 #>>44627695 #>>44627852 #>>44628143 #>>44631034 #>>44631098 #>>44631352 #>>44631995 #>>44632684 #>>44633226 #>>44644288 #
zer00eyz ◴[] No.44627695[source]
> Once it becomes economical to run a Claude 4 class model locally you'll see a lot more people doing that.

Historically these sorts of things happened because of Moore's law. Moore's law is dead. For a while we scaled on the back of "more cores" and process shrinks. It looks like we've hit the wall again.

We seem to be near the limit of scaling (physics): we're not seeing much in clock (some, but not enough), and IPC is flat. We are also having power (density) and cooling (air won't cut it any more) issues.

The requirements to run something like Claude 4 locally aren't going to reach household consumers any time soon. Simply put, the very top end of consumer PCs looks like 10-year-old server hardware, and very few people are running that because there isn't a need.

The only way we're going to see better models locally is if work (research, engineering) is put into it. To be blunt, that isn't really happening, because Fb/MS/Google are scaling in the only way they know how: throw money at it to capture and dominate the market, lock the innovators out of your API, and then milk the consumer however you can. Smaller and local is antithetical to this business model.

Hoping for the innovation that gives you a moat, that makes you the next IBM, isn't the best way to run a business.

Based on how often Google cancels projects, and how often the things Zuck swears are "next" face-plant (metaverse), one should not have a lot of hope about AI.

replies(3): >>44627840 #>>44628024 #>>44630780 #
Aurornis ◴[] No.44630780[source]
> We seem to be near the limit of scaling (physics) we're not seeing a lot in clock (some but not enough), and IPC is flat. We are also having power (density) and cooling (air wont cut it any more) issues.

This is an exaggeration. CPUs are still getting faster. IPC is increasing, not flat. Cooling on air is fine unless you're going for high density or low noise.

This is just cynicism. Even an M4 MacBook Pro is substantially faster than an M1 from a few years ago, which is substantially faster than the previous versions.

Server chips are scaling core counts and bandwidth. GPUs are getting faster and faster.

The only way you could conclude scaling is dead is if you ignored all recent progress or you’re expecting improvements at an unrealistically fast rate.

replies(1): >>44640912 #
zer00eyz ◴[] No.44640912[source]
> IPC is increasing, not flat.

Benchmarks going up is not IPC increasing. These are separate things.

Please look at IPC for the latest GPUs from Nvidia and the latest CPUs from AMD. IPC is flat. See Intel losing credibility with failing processors, due to power problems from aggressive clocking, because IPC is flat.

> Even an M4 MacBook Pro is substantially faster than an M1

Again, clocking. The M4 (non-Pro) and M1 are so close in IPC on common tasks that the difference is negligible. The performance gains between the two come from memory bandwidth, not core performance.

> Server chips are scaling core counts

Parallelism is not the same as performance. Intel shipping the "Core Duo" 20 years ago, running at 2 GHz, was an admission that single-thread scaling was ending. Twenty years on we're 20 cores deep (consumer), and only at 4 GHz with "boost clocks" (back to that pesky power and cooling problem).

And that product still exists today: the N150 (close enough). It has lower power consumption and more cores. And what was the single-core performance gain? A 35% improvement in 20 years.
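Taking that 35%-over-20-years figure at face value, simple compound-growth arithmetic shows how small the annual rate is:

```python
# Convert a 35% single-core gain over 20 years into an annualized rate.
total_gain = 1.35   # the 35% figure from the comment above
years = 20

annual = total_gain ** (1 / years)  # compound annual growth factor
print(f"~{(annual - 1) * 100:.1f}% per year")
# ~1.5% per year
```

About 1.5% a year, which is the single-core stagnation being described.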

None of these things are running any of the LLMs that power the tools we're talking about. Those are in the datacenter, where 700-core CPUs and 400-800 Gbps top-of-rack switching are the bleeding edge. This is where power and cooling have hit the wall. The spacing requirements of a bleeding-edge Nvidia install are driving up the cost of interconnect between systems: lots of fiber, and systems that have to be spaced out because of power/heat, add up to a boatload of extra networking costs. Half-empty racks because of power density are now a reality.

And you see these same issues at home: power demands of consumer and workstation GPUs are through the roof. We're past what the PCI spec can provide, and all that power is heat that has to go somewhere. Sometimes it burns up poorly designed connectors. The latest generation consumes even more power, to push clocks higher, for very little gain (see Nvidia's flat IPC).