Nvidia GPU: spin up OS, run your sims or load your LLM, gather results.
AMD GPU: spin up OS, grok driver fixes, try to run your sims, grok more driver fixes, and you can't even gather results until you've verified that your fixes are actually correct. And yeah, sometimes you need someone with specialized knowledge of numerical methods to help tune those fixes.
... What kind of maddening workflows are these? It's literally negative work: you are busy, you barely get anywhere, and you end up having to do more.
In light of that, the Nvidia tax doesn't look so bad.
The highest-performing inference engines all use Vulkan, and they're faster per dollar-watt on either the CDNA3 cards or (surprisingly) the RDNA3 cards, not Lovelace.
Yeah, right, so Meta and xAI buying hundreds of Nvidia H100s was because they weren't serious about AI? wtf
That doesn't stop Meta's Llama family of models from running on anything and everything _outside_ of Meta, though. Llama.cpp runs on practically everything, for example, but Meta itself doesn't use it.