AMD is done, no one uses their GPUs for AI because AMD were too dumb to understand the value of software lock-in like Nvidia did with CUDA.
replies(3):
Highest performing inference engines all use Vulkan, and are either faster per dollarwatt on the CDNA3 cards or (surprisingly) the RDNA3 cards, not Lovelace.
Yeah right, so Meta and XAI buying hundreds of Nvidia's H100's was because they were not serious in AI. wtf
That doesn't stop Meta's Llama family of models running on anything and everything _outside_ of Meta, though. Llama.cpp works on everything, for example, but Meta doesn't use it.