Apple's MLX adding CUDA support

(github.com)

548 points nsagent | 1 comments | 14 Jul 25 21:40 UTC

lukev ◴[15 Jul 25 02:18 UTC] No.44567263[source]▶

>>44565668 (OP) #

So to make sure I understand, this would mean:

1. Programs built against MLX -> Can take advantage of CUDA-enabled chips

but not:

2. CUDA programs -> Can now run on Apple Silicon.

Because #2 would be a copyright violation (specifically with respect to Nvidia's famous moat).

Is this correct?

replies(9): >>44567309 #>>44567350 #>>44567355 #>>44567600 #>>44567699 #>>44568060 #>>44568194 #>>44570427 #>>44577999 #

saagarjha ◴[15 Jul 25 02:28 UTC] No.44567309[source]▶

>>44567263 #

No, it's because doing 2 would be substantially harder.

replies(2): >>44567356 #>>44567414 #

hangonhn ◴[15 Jul 25 02:48 UTC] No.44567414[source]▶

>>44567309 #

Is CUDA tied so closely to Nvidia's hardware and architecture that its abstractions would not make sense on other platforms? I know very little about hardware and low-level software.

Thanks

replies(4): >>44567469 #>>44567535 #>>44568191 #>>44568597 #

1. lcnielsen ◴[15 Jul 25 05:48 UTC] No.44568191[source]▶

>>44567414 #

The kind of CUDA you or I would write is not very hardware-specific (a few constants here and there), but the kind of CUDA behind cuBLAS, with a million magic flags, inline PTX ("GPU assembly"), and exploitation of driver/firmware hacks, is. It's like the difference between numerics code in C and numerics code in C with tons of inline assembly for each of a number of specific processors.

You can see similar things if you buy datacenter-grade CPUs from AMD or Intel and compare their per-model optimized BLAS builds and compilers to OpenBLAS, or to swapping the vendor builds around. The difference is not world-ending, but you can see a gap of maybe 50% in some cases.
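
To make the C analogy above a bit more concrete, here is a minimal sketch (not taken from cuBLAS or any vendor BLAS) of the same dot product written twice: once in plain C, and once with a processor-specific fast path. The function names `dot_portable` and `dot_tuned` are made up for illustration, and SSE2 intrinsics stand in for the "inline assembly per processor" the comment describes; real tuned libraries go much further than this.

```c
#include <stddef.h>

#if defined(__SSE2__)
#include <emmintrin.h> /* x86 SSE2 intrinsics; only present on x86 targets */
#endif

/* Portable version: plain C, compiles unchanged for any target. */
double dot_portable(const double *a, const double *b, size_t n) {
    double sum = 0.0;
    for (size_t i = 0; i < n; ++i)
        sum += a[i] * b[i];
    return sum;
}

/* "Tuned" version (hypothetical name): takes a hardware-specific path when
 * the compiler targets SSE2, and falls back to the portable loop otherwise.
 * A real library would carry many such paths, one per processor family. */
double dot_tuned(const double *a, const double *b, size_t n) {
#if defined(__SSE2__)
    __m128d acc = _mm_setzero_pd();      /* two running partial sums */
    size_t i = 0;
    for (; i + 2 <= n; i += 2) {
        __m128d va = _mm_loadu_pd(a + i); /* unaligned load of 2 doubles */
        __m128d vb = _mm_loadu_pd(b + i);
        acc = _mm_add_pd(acc, _mm_mul_pd(va, vb));
    }
    double lanes[2];
    _mm_storeu_pd(lanes, acc);
    double sum = lanes[0] + lanes[1];     /* combine the two partial sums */
    for (; i < n; ++i)                    /* leftover element when n is odd */
        sum += a[i] * b[i];
    return sum;
#else
    return dot_portable(a, b, n);
#endif
}
```

The first function moves to a new platform for free. The second only helps on the hardware it was written for, and every additional target means another `#if` branch to write and maintain, which is roughly the situation with the hand-tuned CUDA described above.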