
337 points | throw0101c | 1 comment
mikewarot No.44609878
I'm waiting for the other shoe to drop: someone coming out with an FPGA optimized for reconfigurable computing that lowers the cost of LLM compute by 90% or more.
replies(6): >>44609898 #>>44609932 #>>44610004 #>>44610118 #>>44610319 #>>44610367 #
pdhborges No.44609898
Where would that improvement come from? The hardware is already here to compute GEMMs about as fast as possible.
replies(1): >>44609936 #
leakyfilter No.44609936
Raw GEMM computation was never the real bottleneck, especially on newer GPUs. Feeding the matmuls, i.e. memory bandwidth, is where it's at.
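
To make the compute-vs-bandwidth point concrete, here's a back-of-the-envelope roofline sketch (the matrix sizes and the accelerator FLOPs/bandwidth ratio are illustrative assumptions, not vendor specs): a big training-style GEMM has high arithmetic intensity and is compute-bound, while a batch-1 LLM decode step is essentially a matvec and sits far below the hardware's ridge point, so it's starved for bandwidth no matter how many matmul units you add.

```python
# Roofline back-of-the-envelope (illustrative numbers only).
# For C = A @ B with A (m,k) and B (k,n):
#   FLOPs        = 2*m*n*k
#   bytes moved  = bytes_per_elem * (m*k + k*n + m*n), assuming each
#                  matrix is read/written exactly once (ideal caching).

def arithmetic_intensity(m, n, k, bytes_per_elem=2):
    """FLOPs per byte of DRAM traffic for an fp16 GEMM, ideal data movement."""
    flops = 2 * m * n * k
    bytes_moved = bytes_per_elem * (m * k + k * n + m * n)
    return flops / bytes_moved

# Large square GEMM (training-style): intensity grows with problem size.
big = arithmetic_intensity(4096, 4096, 4096)      # ~1365 FLOPs/byte

# Batch-1 decode step: a 1x4096 activation against a 4096x4096 weight.
# Every weight byte is read for ~1 FLOP of useful work.
decode = arithmetic_intensity(1, 4096, 4096)      # ~1 FLOP/byte

# "Ridge point" of a hypothetical accelerator: peak FLOPs / peak bandwidth.
# A few hundred FLOPs/byte is roughly the shape of a modern GPU.
ridge = 1000e12 / 3.3e12                          # ~303 FLOPs/byte

print(f"big GEMM intensity: {big:.0f} FLOPs/byte (compute-bound, above ridge)")
print(f"decode intensity:   {decode:.2f} FLOPs/byte (memory-bound, below ridge)")
print(f"ridge point:        {ridge:.0f} FLOPs/byte")
```

Anything left of the ridge point is limited by memory bandwidth, not matmul throughput, which is why faster GEMM units alone don't speed up decoding.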