
311 points melodyogonna | 2 comments
MontyCarloHall ◴[] No.45138920[source]
The reason why Python dominates is that modern ML applications don't exist in a vacuum. They aren't the standalone C/FORTRAN/MATLAB scripts of yore that load in some simple, homogeneous data, crunch some numbers, and spit out a single result. Rather, they are complex applications with functionality extending far beyond the number crunching, which requires a robust preexisting software ecosystem.

For example, a modern ML application might need an ETL pipeline to load and harmonize data of various types (text, images, video, etc., all in different formats) from various sources (local filesystem, cloud storage, HTTP, etc.). The actual computation then must leverage many different high-level functionalities, e.g. signal/image processing, optimization, statistics, etc. All of this computation might be too big for one machine, and so the application must dispatch jobs to a compute cluster or cloud. Finally, the end results might require sophisticated visualization and organization, with a GUI and database.

There is no single language besides Python with an ecosystem rich enough to provide literally all of the aforementioned functionality. Python's numerical computing libraries (NumPy/PyTorch/JAX, etc.) all call out to C/C++/FORTRAN under the hood and are thus extremely high-performance. For functionality they don't implement, Python's C/C++ FFIs (e.g. the Python.h C API, NumPy's C integration, PyTorch/Boost C++ integration) are not perfect, but they are good enough that implementing the performance-critical portions of code in C/C++ is much easier than re-implementing entire ecosystems of packages in another language like Julia.
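(The FFI point can be seen with nothing but the standard library: `ctypes` lets Python glue code call straight into compiled C without writing an extension module. A minimal sketch, POSIX-only, assuming the C library's `strlen` is visible in the running process:)

```python
import ctypes

# On POSIX, CDLL(None) opens the running process itself, whose symbol
# table includes the C standard library.
libc = ctypes.CDLL(None)

# Declare the C signature so ctypes marshals arguments correctly:
#   size_t strlen(const char *s);
libc.strlen.restype = ctypes.c_size_t
libc.strlen.argtypes = [ctypes.c_char_p]

print(libc.strlen(b"hello world"))  # 11
```

(Real numerical libraries use the heavier Python.h C API or pybind11 for the same reason: the hot loop runs in C, and Python only pays the call overhead at the boundary.)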

Hizonner ◴[] No.45139364[source]
This guy is worried about GPU kernels, which are never, ever written in Python. As you point out, Python is a glue language for ML.

> There is no single language with a rich enough ecosystem that can provide literally all of the aforementioned functionality besides Python.

That may be true, but some of us are still bitter that all that grew up around an at-least-averagely-annoying language rather than something nicer.

ModernMech ◴[] No.45140625[source]
> This guy is worried about GPU kernels, which are never, ever written in Python. As you point out, Python is a glue language for ML.

That's kind of the point of Mojo, they're trying to solve the so-called "two language problem" in this space. Why should you need two languages to write your glue code and kernel code? Why can't there be a language which is both as easy to write as Python, but can still express GPU kernels for ML applications? That's what Mojo is trying to be through clever use of LLVM MLIR.

bjourne ◴[] No.45145290{3}[source]
> Why can't there be a language which is both as easy to write as Python, but can still express GPU kernels for ML applications? That's what Mojo is trying to be through clever use of LLVM MLIR.

It already exists. It is called PyTorch/JAX/TensorFlow. These frameworks already contain sophisticated compilers for turning computational graphs into optimized GPU code. I dare say that they don't leave enough performance on the table for a completely new language to be viable.
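(The "sophisticated compilers" claim can be made concrete with a toy sketch. All names here are hypothetical and the real systems, e.g. XLA or TorchInductor, work on tensor IRs at vastly greater sophistication, but the core idea of fusing a chain of graph nodes into one pass looks roughly like this:)

```python
# Toy "graph compiler": fuse a chain of elementwise ops into one loop,
# so the data is traversed once instead of once per op. Illustrative
# only -- real frameworks do this on tensor IRs, not Python lists.

def compile_elementwise(*ops):
    """Return a function applying all ops in a single fused pass."""
    def fused(xs):
        out = []
        for x in xs:            # one traversal for the whole chain
            for op in ops:
                x = op(x)
            out.append(x)
        return out
    return fused

square = lambda x: x * x
add_one = lambda x: x + 1

fused = compile_elementwise(square, add_one)
print(fused([1, 2, 3]))  # [2, 5, 10]
```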

davidatbu ◴[] No.45147094{4}[source]
Last I checked, all of PyTorch, TensorFlow, and JAX sit at a layer of abstraction above GPU kernels. They expose GPU kernels (essentially as nodes in the computational graph you mention), but they don't let you write GPU kernels.

Triton, CUDA, etc., let one write GPU kernels.
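(The difference in programming model can be sketched in plain Python. A framework user writes whole-array ops; a kernel author writes the program for a *single index*, and a launcher maps it over a grid. A CPU-only toy analogue, with all names hypothetical: in CUDA or Triton the grid runs in parallel on the GPU, here it's just a loop.)

```python
# Toy analogue of the kernel programming model: the user writes the
# body for one index, and a "launcher" maps it over the grid.

def launch(kernel, n, *args):
    for i in range(n):          # stand-in for the GPU thread grid
        kernel(i, *args)

def saxpy_kernel(i, a, x, y, out):
    # Each "thread" computes one element: out[i] = a * x[i] + y[i]
    out[i] = a * x[i] + y[i]

x, y = [1.0, 2.0, 3.0], [10.0, 20.0, 30.0]
out = [0.0] * 3
launch(saxpy_kernel, 3, 2.0, x, y, out)
print(out)  # [12.0, 24.0, 36.0]
```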

bjourne ◴[] No.45148597{5}[source]
Yes, they kinda do. The computational graph you specify is completely different from the execution schedule it is compiled into. Whether it's 1, 2, or N kernels is irrelevant as long as it runs fast. Mojo being an HLL is conceptually no different from Python. Whether it will eventually become better for DNNs, time will tell.
davidatbu ◴[] No.45148792{6}[source]
I assume HLL = high-level language? Mojo definitely exposes lower-level facilities than Python does. Chris Lattner has even described Mojo as "syntactic sugar over MLIR". (For example, Mojo's native integer type is defined in library code as a struct.)

> Whether it's 1, 2, or N kernels is irrelevant.

Not sure what you mean here, but new kernels are written all the time (flash-attn is a great example), and one can't do that in plain Python. flash-attn was originally written in CUDA C++, and is now also written in Triton.
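(For context on why kernels like flash-attn get written at all: the naive attention computation materializes the full n×n score matrix, and the fused kernel exists precisely to avoid that memory traffic. A tiny pure-Python version of the *naive* computation, purely illustrative:)

```python
import math

def naive_attention(Q, K, V):
    """softmax(Q K^T / sqrt(d)) V over lists of row vectors.
    Builds every row of the n x n score matrix explicitly -- the
    memory traffic a fused kernel like flash-attn is designed to avoid."""
    d = len(Q[0])
    scale = 1.0 / math.sqrt(d)
    out = []
    for q in Q:
        scores = [scale * sum(qi * ki for qi, ki in zip(q, k)) for k in K]
        m = max(scores)                      # numerically stable softmax
        exps = [math.exp(s - m) for s in scores]
        z = sum(exps)
        weights = [e / z for e in exps]
        out.append([sum(w * v[j] for w, v in zip(weights, V))
                    for j in range(len(V[0]))])
    return out

Q = K = V = [[1.0, 0.0], [0.0, 1.0]]
print(naive_attention(Q, K, V))
```

(With one-hot value rows as here, each output row is just the softmax weights, so each row sums to 1 and weights its own query's key highest.)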

bjourne ◴[] No.45152799[source]
Well, Mojo hasn't been released, so we can't say precisely what it can and can't do. If it can emit CUDA code, then it does so by transpiling Mojo into CUDA, and there is no reason why Python couldn't also be transpiled to CUDA.

What I mean is that DNN code is written at a much higher level than kernels. Kernels are just building blocks you use to instantiate your dataflow.