
195 points by rbanffy | 3 comments
pie420
Layperson with no industry knowledge here, but it seems like Nvidia's CUDA moat will fall in the next 2-5 years. It seems impossible to sustain those margins without competition coming in and taking a decent slice of the pie.
metadat
But how will AMD or anyone else push in? CUDA is effectively a whole virtualization layer on top of the hardware and isn't easily replicable; Nvidia has been at it for 17 years.

You are right that eventually something's gotta give, but the path for this next leg isn't yet apparent to me.

P.S. How much is an exaflop or a petaflop, and how significant is it? The numbers thrown around in this article don't mean anything to me. Is this new cluster way more powerful than the previous top system?
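Best I can tell, the raw units at least are just powers of ten, which still doesn't tell me whether the jump is significant:

    1 petaflops = 10^15 floating-point operations per second
    1 exaflops  = 10^18 ops/s = 1,000 petaflops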

1. vlovich123
The API part isn't thaaat hard. Indeed, HIP already works pretty well at getting existing CUDA code to run unmodified on AMD hardware. The bigger challenge is that the AMD and Nvidia architectures are so different that the optimal kernel designs diverge more between Nvidia and AMD than they do between Intel and AMD in CPU land, even including SIMD.
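As a contrived sketch of what "unmodified" means here (not a claim about any particular codebase), the kernel below compiles as-is under both nvcc and hipcc; hipify only mechanically renames the host-side API, and none of that retunes the code for the hardware, e.g. for AMD's 64-wide GCN wavefronts vs Nvidia's 32-wide warps:

    // Identical kernel source for CUDA and HIP.
    __global__ void saxpy(int n, float a, const float *x, float *y) {
        int i = blockIdx.x * blockDim.x + threadIdx.x;
        if (i < n) y[i] = a * x[i] + y[i];
    }
    // The host side is where hipify does its mechanical renaming:
    //   cudaMalloc -> hipMalloc, cudaMemcpy -> hipMemcpy, ...
    // The saxpy<<<grid, block>>>(n, a, x, y) launch syntax is unchanged.
    // What no translator does is re-pick grid/block sizes or shared-memory
    // layouts for a 64-wide wavefront instead of a 32-wide warp.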
2. pjmlp
Only if the only thing one cares about is CUDA C++, and not the rest of the platform: CUDA C, CUDA Fortran, anything that compiles to PTX, plus the libraries, IDE integration, and GPU graphical debugging.
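Inline PTX is a good example of how deep the coupling goes; a minimal sketch (hypothetical helper, standard CUDA inline asm) that reads an Nvidia-specific special register and simply has no HIP translation:

    // Returns this thread's lane index within its warp by reading
    // the PTX special register %laneid. hipify can't translate
    // inline PTX; on AMD this has to be rewritten against the GCN ISA.
    __device__ unsigned lane_id() {
        unsigned lane;
        asm volatile("mov.u32 %0, %%laneid;" : "=r"(lane));
        return lane;
    }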
3. vlovich123
CUDA C works fine with HIP, so I'm not sure what you're referring to. As for the other pieces: GPU graphical debugging isn't relevant for CUDA, and I don't know what IDE integration is special or relevant to CUDA, but AMD does have a ROCm debugger, which I would imagine is sufficient for simultaneous debugging of CPU and GPU. You won't get developer tools like Nsight Systems, but I'm pretty sure AMD has equivalent tooling.
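For what it's worth, the everyday debugging workflow looks much the same on both sides; a minimal sketch, assuming the standard device-side printf/assert support that both nvcc and hipcc provide (cuda-gdb and rocgdb can each break inside such a kernel):

    #include <cstdio>
    #include <cassert>

    // Device-side printf and assert compile under both nvcc and hipcc,
    // so the same sanity checks work whichever debugger you attach.
    __global__ void check(const float *x, int n) {
        int i = blockIdx.x * blockDim.x + threadIdx.x;
        if (i < n) {
            if (x[i] < 0.0f) printf("negative at %d: %f\n", i, x[i]);
            assert(x[i] == x[i]);  // traps the kernel on NaN
        }
    }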

As for Fortran, that doesn't come up much in modern AI stuff. I haven't observed PTX or GCN assembly within AI codebases, but maybe you have extra insight there.