
548 points nsagent | 1 comment
lukev No.44567263
So to make sure I understand, this would mean:

1. Programs built against MLX -> Can take advantage of CUDA-enabled chips

but not:

2. CUDA programs -> Can now run on Apple Silicon.

Because #2 would be a copyright violation (specifically with respect to NVidia's famous moat).

Is this correct?

replies(9): >>44567309 #>>44567350 #>>44567355 #>>44567600 #>>44567699 #>>44568060 #>>44568194 #>>44570427 #>>44577999 #
sitkack No.44568194
#2 is not a copyright violation. You can reimplement APIs.
replies(2): >>44568364 #>>44568387 #
adastra22 No.44568387
CUDA is not an API, it is a set of libraries written by NVIDIA. You'd have to reimplement those libraries, and for people to care at all you'd have to reimplement the optimizations in those libraries. That does get into various IP issues.
replies(3): >>44568506 #>>44568575 #>>44570953 #
pjmlp No.44568575
CUDA is neither an API, nor a set of libraries, people get this wrong all the time.

CUDA is an ecosystem of programming languages, libraries and developer tools.

It comprises compilers for C, C++, Fortran, and Python JIT DSLs, provided by NVidia, plus several others that target either PTX or NVVM IR.

The libraries, which you correctly point out.

And then the IDE integrations, a GPU debugger on par with Visual Studio-style debugging, the profiler, and so on.

Hence why everyone who focuses on copying only CUDA C or CUDA C++, without everything else that makes CUDA relevant, keeps failing.

replies(1): >>44572506 #
CamperBob2 No.44572506
Only the runtime components matter, though. Nobody cares about the dev tools beyond the core compiler. What people want is to be able to recompile and run on competitive hardware, and I don't understand why that's such an intractable problem.
replies(4): >>44573430 #>>44573665 #>>44573734 #>>44579484 #
int_19h No.44573734
It's the same essential problem as with e.g. Wine - if you're trying to reimplement someone else's constantly evolving API with a closed-source implementation, it takes a lot of effort just to barely keep up.

As for portability, people who care about it already have the option of using higher-level APIs that offer a CUDA backend among several others. The main reason you'd want to write CUDA directly is to squeeze that last bit of performance out of the hardware, but that is also precisely the area where deviation in small details starts to matter a lot.