1045 points mfiguiere | 97 comments
btown ◴[] No.39345221[source]
Why would this not be AMD’s top priority among priorities? Someone recently likened the situation to an Iron Age where NVIDIA owns all the iron. And this sounds like AMD knowing about a new source of ore and not even being willing to sink a single engineer’s salary into exploration.

My only guess is they have a parallel skunkworks working on the same thing, but in a way that they can keep it closed-source - that this was a hedge they think they no longer need, and they are missing the forest for the trees on the benefits of cross-pollination and open source ethos to their business.

replies(14): >>39345241 #>>39345302 #>>39345393 #>>39345400 #>>39345458 #>>39345853 #>>39345857 #>>39345893 #>>39346210 #>>39346792 #>>39346857 #>>39347433 #>>39347900 #>>39347927 #
1. hjabird ◴[] No.39345853[source]
The problem with effectively supporting CUDA is that it encourages CUDA adoption all the more strongly. Meanwhile, AMD will always be playing catch-up, forever having to patch issues, work around Nvidia/AMD differences, and accept the performance penalty that comes from having code optimised for another vendor's hardware. AMD needs to encourage developers to use their own ecosystem or an open standard.
replies(13): >>39345944 #>>39346147 #>>39346166 #>>39346182 #>>39346270 #>>39346295 #>>39346339 #>>39346835 #>>39346941 #>>39346971 #>>39347964 #>>39348398 #>>39351785 #
2. slashdev ◴[] No.39345944[source]
With Nvidia controlling 90%+ of the market, this is not a viable option. They'd better lean hard into CUDA support if they want to be relevant.
replies(1): >>39346142 #
3. cduzz ◴[] No.39346142[source]
A bit of story telling here:

IBM and Microsoft made OS/2. The first version worked on 286s and was stable but useless.

The second version worked only on 386s and was quite good, and even had wonderful windows 3.x compatibility. "Better windows than windows!"

At that point Microsoft wanted out of the deal; they wanted to make their own newer version of Windows, NT, which they did.

IBM now had a competitor to "new" Windows and a very compatible version of "old" Windows. Microsoft killed OS/2 in a variety of ways (including just letting IBM be IBM) but also by making it very difficult for last month's version of OS/2 to run next month's bunch of Windows programs.

To bring this back to the point -- IBM vs Microsoft is akin to AMD vs Nvidia -- where nvidia has the standard that AMD is implementing, and so no matter what if you play in the backward compatibility realm you're always going to be playing catch-up and likely always in a position where winning is exceedingly hard.

As WOPR once said "interesting game; the only way to win is to not play."

replies(4): >>39346304 #>>39346399 #>>39347110 #>>39348097 #
4. bachmeier ◴[] No.39346147[source]
> The problem with effectively supporting CUDA is that encourages CUDA adoption all the more strongly.

I'm curious about this. Sure some CUDA code has already been written. If something new comes along that provides better performance per dollar spent, why continue writing CUDA for new projects? I don't think the argument that "this is what we know how to write" works in this case. These aren't scripts you want someone to knock out quickly.

replies(2): >>39346290 #>>39346821 #
5. panick21_ ◴[] No.39346166[source]
That's not guaranteed at all. One could make the same argument about Linux vs Commercial Unix.

If they put their stuff out as open source, including firmware, I think they will win out eventually.

And it's also not guaranteed that Nvidia will always produce the superior hardware for that code.

6. kgeist ◴[] No.39346182[source]
Intel embraced AMD64, ditching Itanium. Wasn't it a good decision that worked out well? Is it comparable?
replies(2): >>39346627 #>>39346836 #
7. throwoutway ◴[] No.39346270[source]
Is it? Apple Silicon exists, but Apple created a translation layer above it so the transition could be smoother.
replies(2): >>39346421 #>>39346493 #
8. Uehreka ◴[] No.39346290[source]
> If something new comes along that provides better performance per dollar spent

They won’t be able to do that, their hardware isn’t fast enough.

Nvidia is beating them at hardware performance, AND ALSO has an exclusive SDK (CUDA) that is used by almost all deep learning projects. If AMD can get their cards to run CUDA via ROCm, then they can begin to compete with Nvidia on price (though not performance). Then, and only then, if they can start actually producing cards with equivalent performance (also a big stretch) they can try for an Embrace Extend Extinguish play against CUDA.

replies(1): >>39346889 #
9. coldtea ◴[] No.39346295[source]
>The problem with effectively supporting CUDA is that encourages CUDA adoption all the more strongly

Worked fine for MS with Excel supporting Lotus 123 and Word supporting WordPerfect's formats when those were dominant...

replies(2): >>39346449 #>>39346457 #
10. panick21_ ◴[] No.39346304{3}[source]
IBM also made a whole bunch of strategic mistakes beyond that. Most importantly their hardware division didn't give a flying f about OS/2. Even when they had a 'better Windows' they did not actually use it themselves and didn't push it to other vendors.

Windows NT wasn't really relevant in that competition until much later; only with XP did the NT line finally reach end consumers.

> where nvidia has the standard that AMD is implementing, and so no matter what if you play in the backward compatibility realm you're always going to be playing catch-up

That's not true. If AMD starts adding their own features and have their own advantages, that can flip.

It only takes a single generation of hardware, or a single feature for things to flip.

Look at Linux and Unix. It started out with Linux implementing Unix, and now the Unixes are trying to add compatibility with Linux.

Is SGI still the driving force behind OpenGL/Vulkan? Did you think it was a bad idea for other companies to use OpenGL?

AMD was successful against Intel with x86_64.

There are lots of examples of a company making something popular but not being able to take full advantage of it in the long run.

replies(1): >>39346623 #
11. andy_ppp ◴[] No.39346339[source]
When the alternative is failure I suppose you choose the least bad option. Nobody is betting the farm on ROCm!
replies(1): >>39346686 #
12. incrudible ◴[] No.39346399{3}[source]
Windows before NT was crap, so users had an incentive to upgrade. If there had existed a Windows 7 alternative that was near fully compatible and FOSS, I would wager Microsoft would have lost to it with Windows 8 and even 10. The only reason to update for most people was Microsoft dropping support.

For CUDA, it is not just AMD who would need to catch up. Developers also are not necessarily going to target the latest feature set immediately, especially if it only benefits (or requires) new hardware.

I accept the final statement, but that also means AMD for compute is gonna be dead like OS/2. Their stack just will not reach critical mass.

replies(1): >>39347325 #
13. jack_pp ◴[] No.39346421[source]
Not really the same: Apple was absolutely required to do this so people could transition smoothly, and it wasn't competing against another company/platform. It just needed apps from its previous platform to work while people recompiled for the current one, which they will.
14. Dork1234 ◴[] No.39346449[source]
Microsoft could do that because they had the operating system monopoly to leverage and take out both Lotus 123 and WordPerfect. Without the monopoly of the operating system they wouldn't have been able to Embrace, Extend, Extinguish.

https://en.wikipedia.org/wiki/Embrace,_extend,_and_extinguis...

15. bell-cot ◴[] No.39346457[source]
But MS controlled the underlying OS. Letting them both throw money at the problem, and (by accounts at the time) frequently tweak the OS in ways that made life difficult for Lotus, WordPerfect, Ashton-Tate, etc.
replies(1): >>39346585 #
16. Jorropo ◴[] No.39346493[source]
This is extremely different, apple was targeting end consumers that just want their app to run. The performance between apple rosetta and native cpu were still multiple times different.

People writing CUDA apps don't just want stuff to run, performance is an extremely important factor else they would target CPUs which are easier to program for.

From their readme:
> On Server GPUs, ZLUDA can compile CUDA GPU code to run in one of two modes:
> Fast mode, which is faster, but can make exotic (but correct) GPU code hang.
> Slow mode, which should make GPU code more stable, but can prevent some applications from running on ZLUDA.

replies(2): >>39346735 #>>39346899 #
17. p_l ◴[] No.39346585{3}[source]
Last I checked, Lotus did themselves in by not innovating, betting on the wrong horse (OS/2), and then not doing well on the pivot to Windows.

Meanwhile Excel was gaining features and winning users with them even before Windows was in play.

replies(2): >>39347201 #>>39347244 #
18. chuckadams ◴[] No.39346623{4}[source]
Slapping a price tag of over $300 on OS/2 didn’t do IBM any favors either.
replies(1): >>39347215 #
19. teucris ◴[] No.39346627[source]
In hindsight, yes, but just because a specific technology is leading an industry doesn’t mean it’s going to be the best option. It has to play out long enough for the market to indicate a preference. In this case, for better or worse, it looks like CUDA’s the preference.
replies(1): >>39346692 #
20. hjabird ◴[] No.39346686[source]
True. This is the big advantage of an open standard instead of jumping from one vendor's walled garden to another.
21. diggan ◴[] No.39346692{3}[source]
> It has to play out long enough for the market to indicate a preference

By what measure hasn't that happened already? CUDA has been around and constantly improving for more than 15 years, and there are no competitors in sight so far. It's basically the de facto standard in many ecosystems.

replies(1): >>39347461 #
22. hamandcheese ◴[] No.39346735{3}[source]
> The performance between apple rosetta and native cpu were still multiple times different.

Rosetta 2 runs apps at 80-90% of their native speed.

replies(1): >>39349772 #
23. dotnet00 ◴[] No.39346821[source]
If something new comes along that provides better performance per dollar, but you have no confidence that it'll continue to be available in the future, it's far less appealing. There's also little point in being cheaper if it just doesn't have the raw performance to justify the effort in implementing in that language.

CUDA currently has the better raw performance, better availability, and a long record indicating that the platform won't just disappear in a couple of years. You can use it on pretty much any NVIDIA GPU and it's properly supported. The same CUDA code that ran on a GTX680 can run on an RTX4090 with minimal changes if any (maybe even the same binary).

In comparison, AMD has a very spotty record with their compute technologies: stuff gets released and becomes effectively abandonware, or support gets dropped after just a few years regardless of the hardware's popularity. For several generations they basically led people on with promises of full support on consumer hardware that either never arrived or arrived when the next generation of cards was already available, and despite the general popularity of the RX 580 and the popularity of the Radeon VII in compute applications, they dropped 'official' support. AMD treats its 'consumer' cards as third-class citizens for compute support, but you aren't going to convince people to seriously look into your platform like that. Plus, it's a lot more appealing to have "GPU acceleration will allow us to take advantage of newer supercomputers, while also offering massive benefits to regular users" than just the former.

This was ultimately what removed AMD as a consideration for us when we were deciding which to focus on for GPU acceleration in our application. Many of us already had access to an NVIDIA GPU of some sort, which would make development easier, while the entire facility had one ROCm-capable AMD GPU at the time, specifically so they could occasionally check in on its status.
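
(For illustration, a minimal sketch of the kind of code being described above - not taken from any particular project. A plain CUDA kernel like this has needed essentially no source changes across GPU generations, and building with nvcc options that also embed PTX is what lets the driver JIT-compile it for architectures that didn't exist when it was written.)

    #include <cstdio>
    #include <cuda_runtime.h>

    // Trivial SAXPY kernel: y = a*x + y. Nothing here is tied to a particular GPU generation.
    __global__ void saxpy(int n, float a, const float* x, float* y) {
        int i = blockIdx.x * blockDim.x + threadIdx.x;
        if (i < n) y[i] = a * x[i] + y[i];
    }

    int main() {
        const int n = 1 << 20;
        const size_t bytes = n * sizeof(float);
        float *hx = new float[n], *hy = new float[n];
        for (int i = 0; i < n; ++i) { hx[i] = 1.0f; hy[i] = 2.0f; }

        float *dx, *dy;
        cudaMalloc((void**)&dx, bytes);
        cudaMalloc((void**)&dy, bytes);
        cudaMemcpy(dx, hx, bytes, cudaMemcpyHostToDevice);
        cudaMemcpy(dy, hy, bytes, cudaMemcpyHostToDevice);

        saxpy<<<(n + 255) / 256, 256>>>(n, 2.0f, dx, dy);   // 256 threads per block
        cudaMemcpy(hy, dy, bytes, cudaMemcpyDeviceToHost);

        printf("y[0] = %f\n", hy[0]);                       // expect 4.0
        cudaFree(dx); cudaFree(dy);
        delete[] hx; delete[] hy;
    }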

24. more_corn ◴[] No.39346835[source]
They have already lost. The question is do they want to come in second in the game to control the future of the world or not play at all?
25. kllrnohj ◴[] No.39346836[source]
Intel & AMD have a cross-license agreement covering everything x86 (and x86_64) thanks to lots and lots of lawsuits over their many years of competition.

So while Intel had to bow to AMD's success and give up Itanium, they weren't then limited by that and could proceed to iterate on top of it.

Meanwhile it'll be a cold day in hell before Nvidia licenses anything about CUDA to AMD, much less allows AMD to iterate on top of it.

replies(2): >>39348437 #>>39349815 #
26. bachmeier ◴[] No.39346889{3}[source]
> They won’t be able to do that, their hardware isn’t fast enough.

Well, then I guess CUDA is not really the problem, so being able to run CUDA on AMD hardware wouldn't solve anything.

> try for an Embrace Extend Extinguish play against CUDA

They wouldn't need to go that route. They just need a way to run existing CUDA code on AMD hardware. Once that happens, their customers have the option to save money by writing ROCm or whatever AMD is working on at that time.

replies(4): >>39347039 #>>39347129 #>>39349668 #>>39356235 #
27. piva00 ◴[] No.39346899{3}[source]
> The performance between apple rosetta and native cpu were still multiple times different.

Not at all; the performance hit was in the low 10s of percent. Before natively supporting Apple Silicon, most of the apps I use for music/video/photography didn't seem to have a performance impact at all, even more so given the M1 machines were so much faster than the Intels.

28. bick_nyers ◴[] No.39346941[source]
The latest version of CUDA is 12.3, and version 12.2 came out 6 months prior. How many people are running an older version of CUDA right now on NVIDIA hardware for whatever particular reason?

Even if AMD lagged support on CUDA versioning, I think it would be widely accepted if the performance per dollar at certain price points was better.

Taking the whole market from NVIDIA is not really an option, it's better to attack certain price points and niches and then expand from there. The CUDA ship sailed a long time ago in my view.

replies(3): >>39347633 #>>39348092 #>>39348793 #
29. hjabird ◴[] No.39346971[source]
There are some great replies to my comment - my original comment was too reductive. However, I still think that entrenching CUDA as the de-facto language for heterogeneous computing is a mistake. We need an open ecosystem for AI and HPC, where vendors compete on producing the best hardware.
replies(1): >>39347760 #
30. ◴[] No.39347039{4}[source]
31. foobiekr ◴[] No.39347110{3}[source]
IBM was also incompetent, and the OS/2 team in Boca had some exceptional engineers but was packed with mostly mediocre-to-bad ones, which is why so many things in OS/2 were bad and why IBM got upset at Microsoft for contributing "negative work" to the project: their lines-of-code contribution was negative (they were rewriting a lot of inefficient, bloated IBM code).

A lot went wrong with OS/2. For CUDA, I think a better analogy is VHS. The standard, in the effective rather than open sense, is what it is. AMD sucks at software and views it as an expense rather than an advantage.

replies(1): >>39347469 #
32. foobiekr ◴[] No.39347129{4}[source]
Intel has the same software issue as AMD but their hardware is genuinely competitive if a generation behind. Cost and power wise, Intel is there; software? No.
33. dadadad100 ◴[] No.39347201{4}[source]
This is a key point. Before Windows we had all the DOS players - WordPerfect was king. Microsoft was more focused on the Mac. I've always assumed that Microsoft understood that a GUI was coming and trained a generation of developers on the main GUI of the day. Once Windows came out, the DOS-focused apps could not adapt in time.
34. BizarroLand ◴[] No.39347215{5}[source]
That's what happens when your primary business model is selling to the military. They had to pay what IBM charged them (within a small bit of reason) and it was incredibly difficult for them to pivot away from any path they chose in the 80's once they had chosen it.

However, that same logic doesn't apply to consumers, and since they continued to fail to learn that lesson, IBM now doesn't even target the consumer market: they never learned how to be competitive and could only ever function effectively when they had a monopoly, or at least vendor lock-in.

https://en.wikipedia.org/wiki/Acquisition_of_the_IBM_PC_busi...

35. robocat ◴[] No.39347244{4}[source]
> betting on the wrong horse (OS/2)

Ahhhh, your hindsight is well developed. I would be interested to know the background on the reasons why Lotus made that bet. We can't know the counterfactual, but Lotus delivering on a platform owned by their deadly competitor Microsoft would seem to me to be a clearly worrisome idea to Lotus at the time. Turned out it was an existentially bad idea. Did Lotus fear Microsoft? "DOS ain't done till Lotus won't run" is a myth[1] for a reason. Edit: DRDOS errors[2] were one reason Lotus might fear Microsoft. We can just imagine a narrative of a different timeline where Lotus delivered on Windows but did some things differently to beat Excel. I agree, Lotus made other mistakes and Microsoft made some great decisions, but the point remains.

We can also suspect that AMD faces a similar fork in the road now. Depending on Nvidia/CUDA may be a similar choice for AMD - fail if they do and fail if they don't.

[1] http://www.proudlyserving.com/archives/2005/08/dos_aint_done...

[2] https://www.theregister.com/1999/11/05/how_ms_played_the_inc...

replies(1): >>39347497 #
36. BizarroLand ◴[] No.39347325{4}[source]
Today's Linux OSes would have competed incredibly strongly against Vista and probably would have gone blow for blow against 7.

Proton, Wine, and all of the compatibility fixes and driver improvements that the community has made in the last 16 years have been amazing, and every day is another day where you can say that it has never been easier to switch away from Windows.

However, Microsoft has definitely been drinking the IBM koolaid a little too long and has lost the mandate of heaven. I think in the next 7-10 years we will reach a point where there is nothing Windows can do that Linux cannot do better and easier without spying on you, and we may be 3-5 years from a "killer app" that is specifically built to be incompatible with Windows just as a big FU to them, possibly in the VR world, possibly in AR, and once that happens maybe, maybe, maybe it will finally actually be the year of the Linux desktop.

replies(3): >>39348271 #>>39348906 #>>39353772 #
37. teucris ◴[] No.39347461{4}[source]
There haven’t been any as successful, but there have been competitors. OpenCL, DirectX come to mind.
replies(1): >>39347824 #
38. AYBABTME ◴[] No.39347469{4}[source]
You would think that by now AMD realizes that poor software is what left them behind in the dust, and would have changed that mindset.
replies(1): >>39347735 #
39. p_l ◴[] No.39347497{5}[source]
I've seen rumours from self-claimed ex-Lotus employees that IBM made a deal with Lotus to prioritise OS/2
40. swozey ◴[] No.39347633[source]
I just went through this over the weekend - if you're running on Windows and want to use DeepSpeed, you still have to use CUDA 12.1, because DeepSpeed 13.1 is the latest that works with 12.1. There's no DeepSpeed for Windows that works with 12.3.

I tried to get it working this weekend but it was a huge PITA, so I switched to putting everything into WSL2, with Arch on there and PyTorch etc. in containers, so I could flip versions easily now that I know how SPECIFIC the versions are to one another.

I'm still working on that part; halfway into it my WSL2 completely broke and I had to reinstall Windows. I'm scared to mount the vhdx right now. ALL of my work and ALL of my documentation is inside the WSL2 Arch Linux and NOT on my Windows machine. I have EVERYTHING I need to quickly put another server up (dotfiles, configs) sitting in a chezmoi git repo ON THE VM - which I only git committed once, at init, about 5 mins into everything. THAT was a learning experience. Now I have no idea if I should follow the "best practice" of keeping projects in WSL or having WSL reach out to Windows (there's a performance drop). The 9p networking stopped working and no matter what I reinstalled, reset, removed features, reset Windows, etc., it wouldn't start. But at least I have that WSL2 .vhdx image that will hopefully mount and start. And probably break WSL2 again. I even SPECIFICALLY took backups of the image as tarballs every hour in case I broke LINUX, not WSL.

If anyone has done SD containers in WSL2 already, let me know. I've tried to use WSL for dev work (I use OSX) like this 2-3 times in the last 4-5 years and I always run into some catastrophically broken thing that makes my WSL stop working. I hadn't used it in years, so I hoped it was super reliable by now. This is on 3 different desktops with completely different hardware, etc. I was terrified it would break this weekend and IT DID. At least I can be up in Windows in 20 minutes thanks to Chocolatey and chezmoi. Wiped out my entire gaming desktop.

Sorry I'm venting now this was my entire weekend.

This repo is from a DeepSpeed contributor (IIRC) and lists the requirements for DeepSpeed + Windows that mention the version matches:

https://github.com/S95Sedan/Deepspeed-Windows

> conda install pytorch==2.1.2 torchvision==0.16.2 torchaudio==2.1.2 pytorch-cuda=12.1 -c pytorch -c nvidia

It may sound weird to do any of this in Windows, or maybe not, but if it does just remember that it's a lot of gamers like me with 4090s who just want to learn ML stuff as a hobby. I have absolutely no idea what I'm doing but thank god I know containers and linux like the back of my hand.

replies(3): >>39347855 #>>39362476 #>>39395549 #
41. hyperman1 ◴[] No.39347735{5}[source]
Most businesses understand the pain points of their suppliers very well, as they feel that pain and have organized themselves around it.

They have a hard time understanding the pain points of their consumers, as they don't feel that pain, look through their own organisation-coloured glasses, and can't tell the real pain points from the whiny-customer ones.

AMD probably thinks software ecosystems are the easy part, ready to be taken on whenever they feel like it with a token amount of investment. They've built a great engine, see the bodywork as beneath them, and don't understand why the lazy customer wants them to build the rest of the car too.

42. ethbr1 ◴[] No.39347760[source]
The problem with open standards is that someone has to write them.

And that someone usually isn't a manufacturer, lest the committee be accused of bias.

Consequently, you get features that are (a) outdated, because the state of the art has already moved beyond them, (b) designed in a way that doesn't correspond to actual practice, and (c) overly generalized.

There are some notable exceptions (e.g. IETF), but the general rule has been that open specs please no one, slowly.

IMHO, FRAND and liberal cross-licensing produce better results.

replies(1): >>39347872 #
43. cogman10 ◴[] No.39347824{5}[source]
SYCL is the latest attempt that I'm aware of. It's still pretty active and may just work out, since it doesn't rely on the video card manufacturers to make it happen.
replies(1): >>39348316 #
44. bick_nyers ◴[] No.39347855{3}[source]
Vent away! Sounds frustrating for sure.

As much as I love Microsoft/Windows for the work they have put into WSL, I ended up just putting Kubuntu on my devices and use QEMU with GPU passthrough whenever I need Windows. Gaming perf is good. You need an iGPU or a cheap second GPU for Linux in order to hand off a 4090 etc. to Windows (unless maybe your motherboard happens to support headless boot but if it's a consumer board it doesn't). Dual boot with Windows always gave me trouble.

replies(2): >>39348016 #>>39348602 #
45. jchw ◴[] No.39347872{3}[source]
Vulkan already has some standard compute functionality. Not sure if it's low level enough to be able to e.g. recompile and run CUDA kernels, but I think if people were looking for a vendor-neutral standard to build GPGPU compute features on top of, I mean, that seems to be the obvious modern choice.
replies(1): >>39348062 #
46. jvanderbot ◴[] No.39347964[source]
If you replace CUDA -> x86 and NVIDIA -> Intel, you'll see a familiar story which AMD has already proved it can work through.

These were precisely the arguments for 'x86 will entrench Intel for all time', and we've seen AMD succeed at that game just fine.

replies(5): >>39347993 #>>39348224 #>>39348252 #>>39348427 #>>39361222 #
47. ianlevesque ◴[] No.39347993[source]
And indeed more than succeed, they invented x86_64.
replies(2): >>39348177 #>>39348424 #
48. katbyte ◴[] No.39348016{4}[source]
I recently gave this a go as I’d not had a windows desktop for a long time, have a beefy Proxmox server and wanted to play some windows only games - works shockingly well with an a4000 and 35m optical hdmi cables! - however I’m getting random audio crackling and popping and I’ve yet to figure out what’s causing it.

At first I thought it was hardware related, but it happens even in a Remote Desktop session, leading me to think it's some weird audio driver thing.

have you encountered anything like this at all?

replies(1): >>39348652 #
49. zozbot234 ◴[] No.39348062{4}[source]
There is already a work-in-progress implementation of HIP on top of OpenCL https://github.com/CHIP-SPV/chipStar and the Mesa RustiCL folks are quite interested in getting that to run on top of Vulkan.

(To be clear, HIP is about converting CUDA source code not running CUDA-compiled binaries but the Zluda project discussed in OP heavily relies on it.)
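
(To make the "source-level conversion" concrete, a hypothetical sketch: HIP mirrors the CUDA runtime API closely enough that a port is mostly mechanical renaming, which is what the hipify tooling automates. The calls below are from the public HIP API; the kernel body itself is unchanged from its CUDA form.)

    #include <hip/hip_runtime.h>   // was: #include <cuda_runtime.h>

    // The kernel is identical to its CUDA counterpart.
    __global__ void scale(int n, float a, float* x) {
        int i = blockIdx.x * blockDim.x + threadIdx.x;
        if (i < n) x[i] *= a;
    }

    void run(float* hostData, int n) {
        float* d = nullptr;
        hipMalloc((void**)&d, n * sizeof(float));                      // was: cudaMalloc
        hipMemcpy(d, hostData, n * sizeof(float),
                  hipMemcpyHostToDevice);                              // was: cudaMemcpy / cudaMemcpyHostToDevice
        scale<<<(n + 255) / 256, 256>>>(n, 2.0f, d);                   // hipcc accepts the same launch syntax
        hipMemcpy(hostData, d, n * sizeof(float), hipMemcpyDeviceToHost);
        hipFree(d);                                                    // was: cudaFree
    }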

50. carlossouza ◴[] No.39348092[source]
Great comment.

I bet there are at least two markets (or niches):

1. People who want the absolute best performance and the latest possible version and are willing to pay the premium for it;

2. People who want to trade performance for cost and accept working with not-the-latest versions.

In fact, I bet the market for (2) is much larger than (1).

51. neerajsi ◴[] No.39348097{3}[source]
I'm not in the gpu programming realm, so this observation might be inaccurate:

I think the case of cuda vs an open standard is different from os2 vs Windows because the customers of cuda are programmers with access to source code while the customers of os2 were end users trying to run apps written by others.

If your shrink-wrapped software didn't run on os2, you'd have no choice but to go buy Windows. Otoh if your ai model doesn't run on an AMD device and the issue is something minor, you can edit the shader code.

52. stcredzero ◴[] No.39348177{3}[source]
> And indeed more than succeed, they invented x86_64.

If AMD invented the analogue of x86_64 for CUDA, this would increase competition and progress in AI by some huge fraction.

replies(2): >>39348847 #>>39351796 #
53. samstave ◴[] No.39348224[source]
Transmeta was Intel's boogeyman in the '90s.
54. ethbr1 ◴[] No.39348252[source]
> These were precisely the arguments for 'x86 will entrench Intel for all time', and we've seen AMD succeed at that game just fine.

... after a couple decades of legal proceedings and a looming FTC monopoly case convinced Intel to throw in the towel, cross-license, and compete more fairly with AMD.

https://jolt.law.harvard.edu/digest/intel-and-amd-settlement

AMD didn't just magically do it on its own.

55. paulmd ◴[] No.39348271{5}[source]
> However, Microsoft has definitely been drinking the IBM koolaid a little to long and has lost the mandate of heaven. I think in the next 7-10 years we will reach a point where there is nothing Windows can do that linux cannot do better and easier without spying on you

that's a fascinating statement with the clear ascendancy of neural-assisted algorithms etc. Things like DLSS are the future - small models that just quietly optimize some part of a workload that was commonly considered impossible to the extent nobody even thinks about it anymore.

my prediction is that in 10 years we are looking at the rise of tag+collection based filesystems and operating system paradigms. all of us generate a huge amount of "digital garbage" constantly, and you either sort it out into the important stuff, keep temporarily, and toss, or you accumulate a giant digital garbage pile. AI systems are gonna automate that process, it's gonna start on traditional tree-based systems but eventually you don't need the tree at all, AI is what's going to make that pivot to true tag/collection systems possible.

Tags mostly haven't worked because of a bunch of individual issues which are pretty much solved by AI. Tags aren't specific enough: well, AI can give you good guesses at relevance. Tagging files and maintaining collections is a pain: well, the AI can generate tags and assign collections for you. Tags really require an ontology for "fuzzy" matching (search for "food" should return the tag "hot dog") - well, LLMs understand ontologies fine. Etc etc. And if you do it right, you can basically have the AI generate "inbox/outbox" for you, deduplicate files and handle versioning, etc, all relatively seamlessly.

microsoft and macos are both clearly racing for this with the "AI os" concept. It's not just better relevance searches etc. And the "generate me a whole paragraph before you even know what I'm trying to type" stuff is not how it's going to work either. That stuff is like specular highlights in video games around 2007 or whatever - once you had the tool, for a few years everything was w e t until developers learned some restraint with it. But there are very very good applications that are going to come out in the 10 year window that are going to reduce operator cognitive load by a lot - that is the "AI OS" concept. What would the OS look like if you truly had the "computer is my secretary" idea? Not just dictating memorandums, but assistance in keeping your life in order and keeping you on-task.

I simply cannot see linux being able to keep up with this change, in the same way the kernel can't just switch to rust - at some point you are too calcified to ever do the big-bang rewrite if there is not a BDFL telling you that it's got to happen.

the downside of being "the bazaar" is that you are standards-driven and have to deal with corralling a million whiny nerds constantly complaining about "spying on me just like microsoft" and continuing to push in their own other directions (sysvinit/upstart/systemd factions, etc) and whatever else, on top of all the other technical issues of doing a big-bang rewrite. linux is too calcified to ever pivot away from being a tree-based OS and it's going to be another 2-3 decades before they catch up with "proper support for new file-organization paradigms" etc even in the smaller sense.

that's really just the tip of the iceberg on the things AI is going to change, and linux is probably going to be left out of most of those commercial applications despite being where the research is done. It's just too much of a mess and too many nerdlingers pushing back to ever get anything done. Unix will be represented in this new paradigm but not Linux - the commercial operators who have the centralization and fortitude to build a cathedral will get there much quicker, and that looks like MacOS or Solaris not linux.

Or at least, unless I see some big announcement from KDE or Gnome or Canonical/Red Hat about a big AI-OS rewrite... I assume that's pretty much where the center of gravity is going to stay for linux.

replies(2): >>39349554 #>>39350910 #
56. zozbot234 ◴[] No.39348316{6}[source]
SYCL is the quasi-successor to OpenCL, built on the same flavor of SPIR-V. Various efforts are trying to run it on top of Vulkan Compute (which tends to be broadly supported by modern GPUs), but it's non-trivial because the technologies are independently developed and there are some incompatibilities.
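
(For readers unfamiliar with it, a minimal vendor-neutral sketch of what SYCL code looks like - SYCL 2020 style, single-source C++ with the kernel written as a lambda. The same source can target different backends depending on the toolchain used to build it.)

    #include <sycl/sycl.hpp>
    #include <iostream>
    #include <vector>

    int main() {
        sycl::queue q;                                   // picks a default device (a GPU if one is available)
        const size_t n = 1024;
        std::vector<float> a(n, 1.0f), b(n, 2.0f), c(n, 0.0f);
        {
            sycl::buffer<float> A(a.data(), sycl::range<1>(n));
            sycl::buffer<float> B(b.data(), sycl::range<1>(n));
            sycl::buffer<float> C(c.data(), sycl::range<1>(n));
            q.submit([&](sycl::handler& h) {
                sycl::accessor xa(A, h, sycl::read_only);
                sycl::accessor xb(B, h, sycl::read_only);
                sycl::accessor xc(C, h, sycl::write_only);
                h.parallel_for(sycl::range<1>(n), [=](sycl::id<1> i) {
                    xc[i] = xa[i] + xb[i];               // element-wise vector add on the device
                });
            });
        }                                                // buffers go out of scope: results copied back to host
        std::cout << c[0] << "\n";                       // expect 3
    }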
57. mindcrime ◴[] No.39348398[source]
Yep. This is very similar to the "catch-22" that IBM wound up in with OS/2 and the Windows API. On the one hand, by supporting Windows software on OS/2, they gave OS/2 customers access to a ready base of available, popular software. But in doing so, they also reduced the incentive for ISV's to produce OS/2 native software that could take advantage of unique features of OS/2.

It's a classic "between a rock and a hard place" scenario. Quite a conundrum.

replies(1): >>39348479 #
58. sangnoir ◴[] No.39348424{3}[source]
x86_64's win was helped by Intel's Itanium misstep. AMD can't bank on Nvidia making a mistake, and Nvidia seems content with incremental changes to CUDA, contrasted with Intel's 32-bit to 64-bit transition. It is highly unlikely that AMD can find and exploit a similar chink in the armor against CUDA.
replies(1): >>39348767 #
59. clhodapp ◴[] No.39348427[source]
If that's the model, it sounds like the path would be to burn money to stay right behind NVIDIA and wait for them to become complacent and stumble technically, creating the opportunity to leapfrog them. Keeping up could be very expensive if they don't force something like the mutual licensing requirements around x86.
60. kevin_thibedeau ◴[] No.39348437{3}[source]
The original cross licensing was government imposed because a second source was needed for the military.
replies(1): >>39348595 #
61. ianlevesque ◴[] No.39348479[source]
Thinking about the highly adjacent graphics APIs history, did anyone really 'win' the Direct3D, OpenGL, Metal, Vulkan war? Are we benefiting from the fragmentation?

If the players in the space have naturally coalesced around one over the last decade, can we skip the thrashing and just go with it this time?

replies(2): >>39348615 #>>39355263 #
62. atq2119 ◴[] No.39348595{4}[source]
Makes you wonder why DoE labs and similar facilities don't mandate open licensing of CUDA.
63. swozey ◴[] No.39348602{4}[source]
Are you flipping from your main GPU to like a GT710 to do the gpu vfio mount? Or can you share the dgpu directly and not have to go headless now?

I've done this on both a hackintosh and Void Linux. I was so excited to get the hackintosh working because I honestly hate day-to-day desktop Linux; it's my day job to work on and I just don't want to deal with it after work.

Unfortunately both would break in significant ways and I'd have to trudge through and fix things. I had that void desktop backed up with Duplicacy (duplicati front end) and IIRC I tried to roll back after breaking qemu, it just dumps all your backup files into their dirs, and I think I broke it more.

I think at that point I was back up in Windows in 30 mins.. and all of its intricacies like bsoding 30% of the time that I either restart it or unplug a usb hub. But my Macbooks have a 30% chance of not waking up on Monday morning when I haven't used them all weekend without me having to grab them and open the screen.

64. tadfisher ◴[] No.39348615{3}[source]
The game engines won. Folks aren't building Direct3D or Vulkan renderers; they're using Unity or Unreal or Godot and clicking "export" to target whatever API makes sense for the platform.

WebGPU might be the thing that unifies the frontend API for folks writing cross-platform renderers, seeing as browsers will have to implement it on top of the platform APIs anyway.

65. swozey ◴[] No.39348652{5}[source]
What are you running for audio? pipewire+jack, pipewire, jack2, pulseaudio? I wonder if it's from latency. PulseAudio is the most common, but if you do any audio engineering or play guitar etc. with your machine, we all use the JACK protocol for lower latency.

https://linuxmusicians.com/viewtopic.php?t=25556

Could be completely unrelated though, RDP sessions can definitely act up, get audio out of sync etc. I try to never do pass through rdp audio, it's not even enabled by default in the mstsc client IIRC but that may just be a "probably server" thing.

replies(1): >>39358625 #
66. LamaOfRuin ◴[] No.39348767{4}[source]
If they're content with incremental changes to CUDA then it doesn't cost much to keep updated compatibility and do it as quickly as any users actually adopt changes.
67. bluedino ◴[] No.39348793[source]
> How many people are running an older version of CUDA right now on NVIDIA hardware for whatever particular reason?

I would guess there are lots of people still running CUDA 11. Older clusters, etc. A lot of that software doesn't get updated very often.

replies(1): >>39352949 #
68. pjmlp ◴[] No.39348847{4}[source]
Only works if NVidia missteps and creates the Itanium version of CUDA.
replies(1): >>39348980 #
69. pjmlp ◴[] No.39348906{5}[source]
There is no competition when games only come to Linux by "emulating" Windows.

The only thing it has going for it is being a free beer UNIX clone for headless environments, and even then, isn't that relevant on cloud environments where containers and managed languages abstract everything they run on.

replies(1): >>39349320 #
70. stcredzero ◴[] No.39348980{5}[source]
You don't think someone would welcome the option to have more hardware buying options, even if the "Itanium version" didn't happen?
71. BizarroLand ◴[] No.39349320{6}[source]
Thanks to the Steam Deck, more and more games are being ported for Linux compatibility by default.

Maybe some Microsoft owned games makers will never make the shift, but if the majority of others do then that's the death knell.

replies(2): >>39350874 #>>39351097 #
72. BizarroLand ◴[] No.39349554{6}[source]
Counterpoint: Most AI stuff is developed on either an OS agnostic language like Python or C, and then ported to Linux/OSX/Windows, so for AI it is less about the OS it runs on than the hardware, drivers, and/or connections that the OS supports.

For the non-vendor lock in AI's (copilot), casting as wide of a net as possible to catch customers as easily as possible should by default mean that they would invest the small amount of money to build linux integrations into their AI platforms.

Plus, the googs has a pretty deep investment into the linux ecosystem and should have little issue pushing bard or gemini or whatever they'll call it next week before they kill it out into a linux compatible interface, and if they do that then the other big players will follow.

And, don't overlook the next generation of VR headsets. People have gotten silly over the Apple headset, but Valve should be rolling out the Deckhard soon and others will start to compete in that space since Apple raised the price bar and should soon start rolling out hardware with more features and software to take advantage of it.

replies(1): >>39353910 #
73. Uehreka ◴[] No.39349668{4}[source]
> Well, then I guess CUDA is not really the problem

It is. All the things are the problem. AMD is behind on both hardware and software, for both gaming and compute workloads, and has been for many years. Their competitor has them beat in pretty much every vertical, and the lock-in from CUDA helps ensure that even if AMD can get their act together on the hardware side, existing compute workloads (there are oceans of existing workloads) won’t run on their hardware, so it won’t matter for professional or datacenter usage.

To compete with Nvidia in those verticals, AMD has to fix all of it. Ideally they’d come out with something better than CUDA, but they have not shown an aptitude for being able to do something like that. That’s why people keep telling them to just make a compatibility layer. It’s a sad place to be, but that’s the sad place where AMD is, and they have to play the hand they’ve been dealt.

74. Jorropo ◴[] No.39349772{4}[source]
Indeed I got that wrong. Sadly minimal SIMD and hardware acceleration support.
75. krab ◴[] No.39349815{3}[source]
Isn't an API out of scope for copyright? In the case of CUDA, it seems they can copy most of it and then iterate on their own, keeping a compatible subset.
76. incrudible ◴[] No.39350874{7}[source]
Are they ported though? I would say thanks to the Steam Deck, Proton is at a point where native Linux ports are unnecessary. It's also a much more stable target to develop against than N+1 Linux distros.
replies(1): >>39352189 #
77. incrudible ◴[] No.39350910{6}[source]
"Neural assisted algorithms" are just algorithms with large lookup tables. Another magnitude of binary bloat, but that's nothing we haven't experienced before. There's no need to fundamentally change the OS paradigm for it.
replies(1): >>39351693 #
78. pjmlp ◴[] No.39351097{7}[source]
Nah, everyone is relying on Proton, there are hardly any native GNU/Linux games being ported, not even Android/NDK ones, where SDL, OpenGL, Vulkan, C, C++ are present, and would be extremely easy to port.
replies(1): >>39356173 #
79. paulmd ◴[] No.39351693{7}[source]
I think we're well past the "dlss is just FSR2 with lookup tables, you can ALWAYS replicate the outcomes of neural algorithms with deterministic ones" phase, imo.

if that's the case you have billion-dollar opportunities waiting for you to prove it!

replies(1): >>39361900 #
80. mqus ◴[] No.39351785[source]
If their primary objective is to sell cards, then they should make it as easy as possible to switch cards.

If their primary objective is to break the CUDA monopoly, they should up their game in software, which means going as far as implementing support for their hardware in the most popular user apps themselves, if necessary. But since they don't seem to want to do that, they should really go for option one, especially if a single engineer already got so far.

Let's say AMD sold a lot of cards with CUDA support. Now nvidia tries to cut them off. What will happen next? A lot of people will replace their cards with nvidia ones. But a lot of the rest will try to make their expensive AMD cards work regardless. And if AMD provides a platform for that, they will get that work for free.

81. jvanderbot ◴[] No.39351796{4}[source]
Why "if only". Intel had been around forever when AMD showed up. CUDA isn't unassailable
82. BizarroLand ◴[] No.39352189{8}[source]
Many are specifically ported to work with Linux without a wrapper, especially among indie games and games from smaller studios.

Unity, Unreal and Godot all support compiling for Linux either by default or with inexpensive or possibly free add-ons. I'm sure many other game engines do as well, and when you're taking a few hours of work at most to add everyone who owns a steam deck or a steam deck clone as a potential customer to your customer base then that is not a tall order.

replies(1): >>39354912 #
83. coderenegade ◴[] No.39352949{3}[source]
Especially if you're deploying models. The latest version of onnx runtime still defaults to cuda 11.
84. coderenegade ◴[] No.39353772{5}[source]
I don't think it'll be a killer app so much as a confluence of different factors. For one thing, we now live in a world where docker is fast becoming as ubiquitous as git, and unlike git, requires a Linux VM to run on Windows. It's also a key technology for the replication and distribution of ML models, which again, are developed on Linux, trained on clusters running Linux, and deployed to servers running Linux. And this is all done in Python, a language native to Linux, which is now one of the most used languages on Earth.

We already see things like Google abandoning tensorflow support for Windows, because they don't have enough devs using Windows to easily maintain it.

And of course, we have a changing of the guard in terms of a generation of software developers who primarily worked on Windows, because that was the way to do it, starting to retire. Younger devs came up in the Google era where Linux is a first class citizen alongside MacOS.

I think these factors are going to change the face of technology in the coming 15 years, and that's likely to affect how businesses and consumers consume technology, even if they don't understand what's actually running under the hood.

85. coderenegade ◴[] No.39353910{7}[source]
Most AI dev that I've seen tends to be done on Linux or MacOS first. Certainly the research and training are, because HPC tends to be Linux. And of course, the models are deployed in containers, a Linux technology, to webservers running Linux.

MS has put a colossal amount of money into catching up, to at least be able to take advantage of the AI wave, that much is clear. Maybe for consumers this will be enough, but R&D-wise I don't see them ever being the default choice.

And this is potentially a huge problem for them in the long run, because OS choice by industry is driven by the available tooling. If they lose ML, they could potentially lose traditional engineering if fields like robotics start relying on Linux more heavily.

86. pjmlp ◴[] No.39354912{9}[source]
They do, yet you will hardly find a big name Studio that will waste additional money doing builds, QA and customer support for GNU/Linux, just let Valve do the needful with Proton.
87. pjmlp ◴[] No.39355263{3}[source]
You missed the Nintendo and Sony APIs as well.

FOSS folks make this a bigger issue than it really is, game studios make a pluggable API on their engine and call it a day, move on into everything else that matters in actually delivering a game.

88. Qwertious ◴[] No.39356173{8}[source]
>Nah, everyone is relying on Proton, there are hardly any native GNU/Linux games being ported

This doesn't make the "play" button any different. People only care if the Proton version is buggy or noticeably less performant, and native ports have no trouble being both of those (see: Rust (game) before the devs dropped Linux support)

replies(1): >>39357056 #
89. Qwertious ◴[] No.39356235{4}[source]
>so being able to run CUDA on AMD hardware wouldn't solve anything.

It limits Nvidia's profit margin - if Nvidia cards run twice as fast but cost more than twice as much, then people will just buy two AMD cards. Meanwhile, it gives AMD some revenue with which to fund an improved CUDA stack.

>their customers have the option to save money by writing ROCm

CUDA saves money by having a fuckton of pre-written CUDA code and being supported as default basically everywhere.

90. pjmlp ◴[] No.39357056{9}[source]
It worked really well for OS/2.
91. katbyte ◴[] No.39358625{6}[source]
I have tried an optical USB cable to KVM to DAC, audio over HDMI, and audio over RDP. All have the same crackle.
replies(1): >>39360488 #
92. swozey ◴[] No.39360488{7}[source]
Oh, it's every single RDP connection? That's definitely not normal for RDP at all. I used to be a Windows engineer so I RDP'd a LOT. RDP was our ssh, lol.

Crackle would happen so rarely that I KNOW it definitely happened, but it wasn't like a 2-day thing; it was probably like once in a year or 6 months, etc.

93. HarHarVeryFunny ◴[] No.39361222[source]
The difference is that AMD's CPUs are designed to implement the x86 and x86-64 ISA, so there is no loss of performance. In contrast, AMD's and NVIDIA's GPU instruction sets and architectures are not the same, and to get top performance out of these architectures code needs to be customized for them.

If you slap a CUDA compatibility layer on top of AMD, then CUDA code optimized for NVIDIA chips would run, but would suffer a performance penalty compared to code that was customized/tuned for AMD, so unless AMD GPUs were sold cheap enough (i.e. with low profit margin) to mitigate this loss of performance you might as well buy NVIDIA in the first place.
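
(A hypothetical illustration of that tuning point: CUDA code is routinely written around NVIDIA's 32-wide warps, and those assumptions don't transfer cleanly to hardware with, say, 64-wide wavefronts.)

    // Warp-level sum reduction written with NVIDIA's 32-lane warps in mind.
    __inline__ __device__ float warpReduceSum(float val) {
        // The starting offset of 16 and the full 32-bit lane mask assume exactly 32 lanes.
        // On GPUs with 64-wide wavefronts, a straight port of this code leaves half the
        // lanes out of the reduction unless the constants (and the surrounding block-level
        // algorithm, shared-memory sizing, occupancy tuning, etc.) are reworked.
        for (int offset = 16; offset > 0; offset >>= 1)
            val += __shfl_down_sync(0xffffffffu, val, offset);
        return val;
    }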

94. incrudible ◴[] No.39361900{8}[source]
Floating point inaccuracies and random seeds aside, something like DLSS is entirely deterministic. It is just a bunch of matrix multiplications.
replies(1): >>39407619 #
95. andrecarini ◴[] No.39362476{3}[source]
> I'm scared to mount the vhdx right now [...] I have EVERYTHING I need [...] sitting in a chezmoi git repo ON THE VM.

You probably already know but just in case you don't: you can set up a Linux VM with VirtualBox on your Windows and then mount the vhdx (read-only) as an additional disk to extract the stuff you need via shared folders.

96. CoolCold ◴[] No.39395549{3}[source]
> halfway into it my WSL2 completely broke and I had to reinstall windows

I'm curious - did WSL2 break so badly that you couldn't even add new distros or remove broken ones? Or did the Windows host itself become unstable?

97. paulmd ◴[] No.39407619{9}[source]
You can’t possibly expect me to take your post seriously when there’s not even any true evidence of cognition involved in its writing. Just some meat flopping around spastically from some chemicals pumped up from the gut, and electrical zaps from the nervous system.

We can see that it’s not magic, the neuron either activates or it doesn’t, so why should I pay attention to some probabilistic stream of gibberish it spewed out? There is nothing meaningful that can be inferred from such systems, right?