311 points melodyogonna | 7 comments
MontyCarloHall ◴[] No.45138920[source]
The reason why Python dominates is that modern ML applications don't exist in a vacuum. They aren't the standalone C/FORTRAN/MATLAB scripts of yore that load in some simple, homogeneous data, crunch some numbers, and spit out a single result. Rather, they are complex applications with functionality extending far beyond the number crunching, which requires a robust preexisting software ecosystem.

For example, a modern ML application might need an ETL pipeline to load and harmonize data of various types (text, images, video, etc., all in different formats) from various sources (local filesystem, cloud storage, HTTP, etc.). The actual computation then must leverage many different high-level functionalities, e.g. signal/image processing, optimization, statistics, etc. All of this computation might be too big for one machine, and so the application must dispatch jobs to a compute cluster or cloud. Finally, the end results might require sophisticated visualization and organization, with a GUI and database.

No single language besides Python has a rich enough ecosystem to provide literally all of the aforementioned functionality. Python's numerical computing libraries (NumPy/PyTorch/JAX etc.) all call out to C/C++/FORTRAN under the hood and are thus extremely high-performance. For functionality they don't implement, Python's C/C++ FFIs (e.g. Python.h, NumPy C integration, PyTorch/Boost C++ integration) are not perfect, but they are good enough that implementing the performance-critical portions of code in C/C++ is much easier than re-implementing entire ecosystems of packages in another language like Julia.
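
To make the FFI point concrete, here is a minimal sketch (not from the original comment) of Python handing work to compiled C code. It uses the standard-library `ctypes` module rather than Python.h or the NumPy C API named above, and it assumes a POSIX system where the C library can be loaded; the helper name `c_strlen` is made up for illustration.

```python
import ctypes
import ctypes.util

# Locate the C standard library. Assumption: a POSIX system (Linux/macOS).
# If find_library returns None, CDLL(None) falls back to the symbols that
# are already loaded into the running Python process.
libc = ctypes.CDLL(ctypes.util.find_library("c"))

# Declare the C signature: size_t strlen(const char *s)
libc.strlen.argtypes = [ctypes.c_char_p]
libc.strlen.restype = ctypes.c_size_t


def c_strlen(data: bytes) -> int:
    """Hypothetical helper: length of a byte string, computed by C's strlen."""
    return libc.strlen(data)


print(c_strlen(b"CUTLASS"))
```

The same pattern (declare the C signature, then call it like a Python function) is roughly what the heavier FFI routes automate, with far more attention to memory ownership and array layouts.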

replies(8): >>45139364 #>>45140601 #>>45141802 #>>45143317 #>>45144664 #>>45146179 #>>45146608 #>>45146905 #
benreesman ◴[] No.45144664[source]
I'm in kind of a different place with it on the inference side.

I've got these crazy tuned up CUDA kernels that are relatively straightforward to build in isolation and really where all the magic happens, and there's this new CUTLASS 3 stuff and modern C++ can call it all trivially.

And then there's this increasingly thin film of torch crap that's just this side of unbuildable and drags in this reference counting and broken setup.py and it's a bunch of up and down projections to the real hacker shit.

I'm thinking I'm about one squeeze of the toothpaste tube from just squeezing that junk out and having a nice, clean, well-groomed C++ program that can link anything and link into anything.

replies(1): >>45146909 #
1. pjmlp ◴[] No.45146909[source]
CUTLASS 4 has first-class support for Python.
replies(1): >>45147770 #
2. saagarjha ◴[] No.45147770[source]
In fact, I doubt the C++ API will be getting much love moving forward.
replies(1): >>45147855 #
3. pjmlp ◴[] No.45147855[source]
At GTC 2025, NVIDIA introduced two major changes in the CUDA ecosystem.

First-class support for Python JIT/DSLs across the whole ecosystem.

A change in the way C++ is used and taught, focused more on standard C++ support and libraries than on low-level CUDA extensions.

So in a way, I think you're kind of right.

replies(1): >>45149470 #
4. benreesman ◴[] No.45149470{3}[source]
Nah, their people are way involved in mdarray, and ROCm is looking to have the "oh no it's broken again" bit flipped off in the RDNA 4/5 cycle.

NVIDIA wants Python and C++ people, they want a new thing to moat up on, and they know it has to be legitimately good to defy economic gravity on chips a lot of companies can design and fab now.

replies(1): >>45149634 #
5. pjmlp ◴[] No.45149634{4}[source]
Intel and AMD don't have anyone but themselves to blame.

And Khronos always expects the community to do the work for its standards, filling in the missing pieces needed for actual excellence in developer experience.

replies(2): >>45150267 #>>45154804 #
6. benreesman ◴[] No.45150267{5}[source]
Well, I blame our legislators, our regulators, and a public that tolerates low-integrity, low-competence leadership. A society gets what it pays for with the standards it sets for its leaders.

Of a great many outcomes, a very topical one today is semiconductors, and the outcomes there rival any in any field for being corrupt, incompetent, and entirely consistent with our speedrun down the road to irrelevance.

7. bigyabai ◴[] No.45154804{5}[source]
You don't seem to know what Khronos even is, at this point. It is a nonprofit consortium; joining the group for the community effort is the only reason it exists.

By blaming Khronos you're really just reiterating blame on Intel and AMD with the tacit inclusion of Apple. I suppose you could blame Nvidia for not giving away their IP, but they were a staunch OpenCL supporter from the start.