
183 points | spacebanana7 | 5 comments

I appreciate that developing ROCm into something competitive with CUDA would require a lot of work, both internally within AMD and through external contributions to the relevant open-source libraries.

However, the amount of resources at stake is incredible. The delta between NVIDIA's market value and AMD's is bigger than the annual GDP of Spain. Even if they needed to hire a few thousand engineers at a few million in comp each, it'd still be a good investment.

fancyfredbot ◴[] No.43547461[source]
There is more than one way to answer this.

They have made an alternative to the CUDA language with HIP, which can do most of the things the CUDA language can.
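
(For anyone who hasn't seen HIP: it is essentially a source-level clone of the CUDA programming model. A minimal sketch, not from the original comment; the runtime calls are real HIP API, the kernel itself is just illustrative:)

    // build with: hipcc saxpy.cpp -o saxpy
    #include <hip/hip_runtime.h>
    #include <cstdio>

    // Same kernel syntax and thread-index built-ins as CUDA.
    __global__ void saxpy(int n, float a, const float* x, float* y) {
        int i = blockIdx.x * blockDim.x + threadIdx.x;
        if (i < n) y[i] = a * x[i] + y[i];
    }

    int main() {
        const int n = 1 << 20;
        float *x, *y;
        hipMalloc((void**)&x, n * sizeof(float));   // cudaMalloc -> hipMalloc
        hipMalloc((void**)&y, n * sizeof(float));
        hipMemset(x, 0, n * sizeof(float));         // values don't matter here,
        hipMemset(y, 0, n * sizeof(float));         // this only shows the API shape
        // hipLaunchKernelGGL replaces the <<<grid, block>>> launch syntax
        // (hipcc also accepts the triple-chevron form).
        hipLaunchKernelGGL(saxpy, dim3((n + 255) / 256), dim3(256), 0, 0,
                           n, 2.0f, x, y);
        hipDeviceSynchronize();
        hipFree(x);
        hipFree(y);
        printf("saxpy done\n");
        return 0;
    }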

You could say that they haven't released supporting libraries like cuDNN, but they are making progress on this with AiTer for example.

You could say that they have fragmented their efforts across too many different paradigms, but I don't think that's it, because Nvidia also supports a lot of different programming models.

I think the reason is that they have not prioritised support for ROCm across all of their products. There are too many different architectures with varying levels of support. This isn't just historical. There is no ROCm support for their latest AI Max 395 APU. There is no nice cross architecture ISA like PTX. The drivers are buggy. It's just all a pain to use. And for that reason "the community" doesn't really want to use it, and so it's a second class citizen.

This is a management and leadership problem. They need to make using their hardware easy. They need to support all of their hardware. They need to fix their driver bugs.

replies(6): >>43547568 #>>43547675 #>>43547799 #>>43547827 #>>43549724 #>>43558036 #
thrtythreeforty ◴[] No.43547568[source]
This ticket, finally closed after being open for 2 years, is a pretty good microcosm of this problem:

https://github.com/ROCm/ROCm/issues/1714

Users were complaining that the docs didn't even specify which cards work.

But it goes deeper - a valid complaint is that "this only supports one or two consumer cards!" A common rebuttal is that it works fine on lots of AMD cards if you set some environment flag to force the GPU architecture selection. The fact that this is so close to working on a wide variety of hardware, and yet doesn't, is exactly the vibe you get with the whole ecosystem.
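
(The environment flag being referred to here is presumably HSA_OVERRIDE_GFX_VERSION, which tells the ROCm runtime to treat an officially unsupported card as a nearby supported gfx target. A hedged sketch, not from the original comment; it's normally set in the shell, e.g. HSA_OVERRIDE_GFX_VERSION=10.3.0 ./my-app, but it can also be set in-process before the runtime initializes:)

    #include <hip/hip_runtime.h>
    #include <cstdio>
    #include <cstdlib>

    int main() {
        // Pretend the card is gfx1030; the right value depends on the actual GPU,
        // and a wrong value tends to crash or miscompute rather than fail cleanly.
        setenv("HSA_OVERRIDE_GFX_VERSION", "10.3.0", 1);

        hipDeviceProp_t prop{};
        if (hipGetDeviceProperties(&prop, 0) == hipSuccess) {
            printf("device 0: %s (arch %s)\n", prop.name, prop.gcnArchName);
        } else {
            printf("no usable HIP device\n");
        }
        return 0;
    }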

replies(6): >>43547700 #>>43547940 #>>43547988 #>>43548203 #>>43549097 #>>43550313 #
iforgotpassword ◴[] No.43549097[source]
What I don't get is why they don't at least assign a dev or two to make the poster child of this work: llama.cpp

It's the first thing anyone tries when dabbling in AI or GPU compute, yet it's a clusterfuck to get working. A few blessed cards work, with the proper drivers and kernel; others just crash, perform horribly slowly, or output GGGGGGGGGGGGGG to every input (I'm not making this up!). Then you LOL, dump it, go buy Nvidia, et voilà, stuff works on the first try.

replies(1): >>43553556 #
wkat4242 ◴[] No.43553556[source]
It does work; I have it running on my Radeon VII Pro
replies(1): >>43555944 #
Filligree ◴[] No.43555944[source]
It sometimes works.
replies(1): >>43557032 #
1. wkat4242 ◴[] No.43557032[source]
How so? It's rock solid for me. I use ollama, but it's based on llama.cpp

It's also quite fast, probably because that card has fast HBM2 memory (it has the same memory bandwidth as a 4090). And it was really cheap, as it was on deep sale as an outgoing model.
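
(For what it's worth, the bandwidth claim roughly checks out against the publicly quoted specs; a quick back-of-the-envelope, my numbers rather than the poster's:)

    #include <cstdio>

    // Peak bandwidth (GB/s) = (bus width in bits / 8) * effective data rate (Gbps per pin)
    int main() {
        double radeon_vii = (4096 / 8.0) * 2.0;  // HBM2: 4096-bit bus @ 2.0 Gbps -> ~1024 GB/s
        double rtx_4090   = (384 / 8.0) * 21.0;  // GDDR6X: 384-bit bus @ 21 Gbps -> ~1008 GB/s
        printf("Radeon VII: %.0f GB/s, RTX 4090: %.0f GB/s\n", radeon_vii, rtx_4090);
        return 0;
    }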

replies(2): >>43558171 #>>43560454 #
2. Filligree ◴[] No.43558171[source]
"Sometimes" as in "on some cards". You're having luck with yours, but that doesn't mean it's a good place to build a community.
replies(1): >>43560427 #
3. wkat4242 ◴[] No.43560427[source]
Ah, I see. Yes, but you pick the card for the purpose, of course. I also don't like how limited their ROCm support is. But when it works, it works well.

I have Nvidia cards too, by the way: a 4090 and a 3060 (the latter I also use for AI, but more for Whisper, because faster-whisper doesn't do ROCm right now).

4. halJordan ◴[] No.43560454[source]
Aside from the fact that gfx906 is one of the blessed architectures mentioned (so why wouldn't it work): how do you look at your specific instance and then turn around and say "All of you are lying, it works perfectly"? How do you square that circle in your head?
replies(1): >>43564900 #
5. wkat4242 ◴[] No.43564900[source]
No, I was just a bit thrown by the "sometimes"; I thought they were referring to a reliability issue. I am aware of the limited card support in ROCm, and I complained about that elsewhere in the thread too.

Also, I didn't accuse anyone of lying; no need to be so confrontational. And my remark to the original poster at the top was from before they clarified their post.

I just don't really see what AMD can do to make ollama work better, other than porting ROCm to all their cards, which is definitely something they should do.

And no, I'm not an AMD fanboi. I have no loyalty to anyone, any company, or any country.