> IPC is increasing, not flat.
Benchmarks going up is not IPC increasing. These are separate things.
Please look at IPC for the latest GPUs from Nvidia and the latest CPUs from AMD. The IPC is flat. See Intel losing credibility with failing processors due to power problems from pushing clocks, because IPC is flat.
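To make the distinction concrete, here is a rough sketch that splits a benchmark gain into a clock part and an IPC part. It assumes the score scales as IPC times clock, which ignores memory, cache, and core-count effects. The function name and every number are made up for illustration; none of them are measurements of a real chip.

```python
def ipc_ratio(score_old, score_new, ghz_old, ghz_new):
    """Estimate relative IPC change, assuming score is proportional to IPC * clock."""
    speedup = score_new / score_old      # what the benchmark shows
    clock_gain = ghz_new / ghz_old       # how much of that is just frequency
    return speedup / clock_gain          # what is left over is IPC

# Hypothetical numbers only: the score goes up 40%, and so does the clock.
ratio = ipc_ratio(score_old=1000, score_new=1400, ghz_old=3.0, ghz_new=4.2)
print(f"IPC ratio: {ratio:.2f}")
```

With these made-up inputs the whole benchmark gain is accounted for by frequency, so the IPC ratio comes out at about 1.0. That is flat IPC with a rising benchmark.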
> Even an M4 MacBook Pro is substantially faster than an M1
Again, clocking. The M4 (non-Pro) and the M1 are so close in IPC on common tasks that the difference is negligible. The performance gains between the two come from memory bandwidth, not core performance.
> Server chips are scaling core counts
Parallelism is not the same as performance. Intel dropping the "Core Duo" 20 years ago, a part running at 2 GHz, was an admission that single-threaded scaling was ending. 20 years on we're 20 cores deep (consumer), and only at 4 GHz with "boost clocks" (back to that pesky power and cooling problem).
And this product still exists today: the N150 (close enough). It has lower power consumption and more cores. And what was the single-core performance gain? A 35% improvement in 20 years.
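For scale, the 35% over 20 years figure can be turned into a compound annual rate. This is plain arithmetic on the two numbers above, nothing measured; the variable names are just for illustration.

```python
total_gain = 0.35   # 35% single-core improvement, the figure quoted above
years = 20

# Compound annual growth rate: (1 + total) ** (1 / years) - 1
annual = (1 + total_gain) ** (1 / years) - 1
print(f"{annual:.2%} per year")
```

That works out to roughly 1.5% per year.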
None of these things are running any of the LLMs that power the tools we're talking about. Those are in the datacenter. 700-core CPUs and 400-800 Gbps top-of-rack switching are the bleeding edge. This is where "power" and cooling have hit the wall. The spacing requirements of a bleeding-edge Nvidia install are driving up the cost of interconnect between systems. Lots of fiber, plus the need to space things out because of power and heat, adds up to a boatload of extra networking costs. Having half-empty racks because of density limits is now a reality.
And you see these same issues at home: power demands of GPUs for consumers and workstations are through the roof. We're past what the PCI spec can provide, and all that power is heat that has to go somewhere. Sometimes it burns up poorly designed connectors. The latest gen consumes even more power, to push clocks higher, for very little gain (see flat IPC, Nvidia).