85 points | homarp | 1 comment
semessier ◴[] No.44611583[source]
Two years ago I'd have bet that matmul would be running on transformer-optimized hardware costing a fraction of GPUs, supported first-class in torch, with no reason to use GPUs any more. Wrong.
replies(2): >>44611628 #>>44613343 #
1. gchadwick ◴[] No.44613343[source]
The real bottleneck is memory: optimize your matmul architecture all you like, but whilst it's still connected to a big chunk of HBM (or whatever your chosen high-bandwidth memory is) you can only do so much.

So really GPU vs. not-GPU (e.g. TPU) doesn't matter a whole lot if you've got fundamentally the same memory architecture.
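The memory-bound argument can be made concrete with a quick roofline sketch. The peak-FLOPS and bandwidth numbers below are illustrative assumptions, not any specific chip's specs; the point is that a dense matmul gains arithmetic intensity with size, while the matrix-vector products that dominate batch-1 transformer decoding sit at roughly 1 FLOP per byte, far below any realistic ridge point, so bandwidth, not the matmul unit, sets the ceiling:

```python
# Roofline sketch: when is a kernel compute-bound vs. memory-bound?
# Hardware numbers are assumptions for illustration only.
peak_flops = 300e12      # 300 TFLOP/s of matmul compute (assumed)
hbm_bandwidth = 2e12     # 2 TB/s of HBM bandwidth (assumed)

# Ridge point: FLOPs per byte needed to saturate compute instead of memory.
ridge = peak_flops / hbm_bandwidth  # here, 150 FLOPs/byte

def matmul_intensity(n, bytes_per_elem=2):
    """Arithmetic intensity of an NxN fp16 matmul with ideal reuse:
    ~2*N^3 FLOPs over ~3*N^2 matrices' worth of traffic."""
    return (2 * n**3) / (3 * n**2 * bytes_per_elem)

def matvec_intensity(n, bytes_per_elem=2):
    """NxN matrix times vector (batch-1 decode): ~2*N^2 FLOPs,
    but the whole N^2 weight matrix must stream from memory."""
    return (2 * n**2) / (n**2 * bytes_per_elem)

for n in (256, 1024, 8192):
    ai = matmul_intensity(n)
    print(f"matmul N={n}: {ai:6.1f} FLOPs/byte -> "
          f"{'compute' if ai > ridge else 'memory'}-bound")

ai = matvec_intensity(8192)
print(f"matvec N=8192: {ai:6.1f} FLOPs/byte -> "
      f"{'compute' if ai > ridge else 'memory'}-bound")
```

Large training matmuls clear the ridge easily, which is why matmul ASICs look attractive on paper; the matvec never does, which is why inference throughput tracks memory bandwidth regardless of what sits behind it.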