
195 points | rbanffy | 1 comment
pie420:
Layperson with no industry knowledge here, but it seems like Nvidia's CUDA moat will fall in the next 2-5 years. It seems impossible to sustain those margins without competition coming in and taking a decent slice of the pie.
metadat:
But how will AMD or anyone else push in? CUDA is actually a whole virtualization layer on top of the hardware and isn't easily replicable; Nvidia has been at it for 17 years.

You are right that eventually something's gotta give, but the path for this next leg isn't yet apparent to me.

P.S. How much is an exaflop or a petaflop, and how significant is it? The numbers thrown around in this article don't mean anything to me. Is this new cluster way more powerful than the previous top one?

NineStarPoint:
A high-grade consumer GPU (a 4090) is about 80 teraflops. So, rounding up to 100 teraflops per card, an exaflop is about 10,000 consumer-grade cards' worth of compute, and a petaflop is about 10 (quick arithmetic sketch below).

Which doesn't help with understanding how much more impressive these clusters are than the last ones, but it does, for me at least, put the amount of compute they have into focus.
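
A minimal back-of-the-envelope sketch of that arithmetic (plain Python; the 100 TFLOPS per card is just the rounded-up figure from above):

    # How many ~100 TFLOPS consumer cards add up to a petaflop or an exaflop.
    CARD_FLOPS = 100e12  # RTX 4090 FP32 throughput, rounded up from ~80 TFLOPS
    PETAFLOP = 1e15
    EXAFLOP = 1e18

    print(PETAFLOP / CARD_FLOPS)  # 10.0    -> a petaflop is ~10 cards
    print(EXAFLOP / CARD_FLOPS)   # 10000.0 -> an exaflop is ~10,000 cards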

winwang:
4090 tensor performance (FP8): 660 teraflops, or 1320 "with sparsity" (i.e., the theoretical maximum with zeroes in the right places).

https://images.nvidia.com/aem-dam/Solutions/geforce/ada/nvid...

But at these levels of compute, memory and interconnect bandwidth become the bottleneck (a rough roofline sketch below).
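
A rough roofline-style sketch of why (Python; the ~1 TB/s GDDR6X bandwidth figure and the matmul byte counts are my assumptions, not from the linked spec sheet):

    # How many FLOPs must be done per byte fetched from memory before the
    # 4090's FP8 tensor cores, rather than its memory bandwidth, become the limit.
    PEAK_FLOPS = 660e12      # dense FP8 tensor throughput (from the spec sheet)
    MEM_BW_BYTES = 1.008e12  # assumed GDDR6X bandwidth, ~1 TB/s

    print(PEAK_FLOPS / MEM_BW_BYTES)  # ~655 FLOPs per byte to stay compute-bound

    # For scale: a dense N x N FP8 matmul does ~2*N^3 FLOPs over ~3*N^2 bytes,
    # an intensity of ~2N/3 FLOPs/byte, so N needs to be ~1000 before the card
    # is compute-bound rather than bandwidth-bound.

Anything that reuses each byte fewer than ~650 times is waiting on memory, which is why memory and interconnect bandwidth dominate at cluster scale.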