
164 points mpweiher | 1 comment | source
testdelacc1 ◴[] No.45113380[source]
LLVM is the code generation backend used by several languages, including Rust, and by clang, one of the many compilers for C and C++. Code generated by these compilers is considered "fast/performant" thanks to LLVM.

The problem with LLVM has always been that it takes a long time to produce code. The linked post promises a new backend that produces slower artifacts, but does so 10-20x more quickly. This is great for debug builds.

This doesn’t mean compilation as a whole gets 10-20x quicker. There are three steps in compilation:

- Front end: transforms source code into the LLVM intermediate representation (IR)

- Backend: this is where LLVM comes in. It accepts LLVM IR and transforms it into machine code

- Linking: a separate program links the artifacts produced by LLVM.

How long does each step take? It really depends on the program being compiled. This blog post contains timings for one example program (https://blog.rust-lang.org/2023/11/09/parallel-rustc/) to give you an idea. It also depends on whether LLVM is asked to produce a debug build (not performant, but quick to produce) or a release build (fully optimised, takes longer).

The 10-20x improvement described here doesn’t work yet for clang or rustc, and when it does it will only speed up the backend portion. Nevertheless, this is still an incredible win for compile times because the other two steps can be optimised independently. Great work by everyone involved.

replies(3): >>45113440 #>>45113555 #>>45113671 #
tialaramex ◴[] No.45113440[source]
IMO the worst problem with LLVM isn't that it's slow. The worst problem is that its IR has poorly defined semantics, or that its team doesn't actually deliver those semantics and a bug ticket asking "Hey, what gives?" lands on the pile of never-never tickets. That makes it less useful as a compiler backend even if it were instant.

This is the old "correctness versus performance" problem, and we already know that "faster but wrong" isn't meaningfully faster; it's just wrong. Anybody can give a wrong answer immediately, so that's not useful at all.

replies(2): >>45113616 #>>45119553 #
randomNumber7 ◴[] No.45113616[source]
What is the alternative for a new language, though? Transpiling to C, or hacking something together using the GCC backend?
replies(3): >>45113660 #>>45113723 #>>45113897 #
pjmlp ◴[] No.45113897[source]
Produce dumb machine code, of just enough quality to bootstrap the language, and go from there.

Move away from classical UNIX compiler pipelines.

However, in current times I would rather invest in LLM improvements for generating executables directly. The time to mix AI into compiler development has come, and classical programming languages are, in terms of value, just like doing yet another UNIX clone.

replies(1): >>45114370 #
taminka ◴[] No.45114370[source]
mm, a non-deterministic compiler with no way to verify correctness, what could go wrong lol
replies(1): >>45115740 #
pjmlp ◴[] No.45115740[source]
Ask C and C++ developers, they are used to it, and still plenty of critical software keeps being written with them.
replies(2): >>45116969 #>>45119546 #
saagarjha ◴[] No.45119546[source]
Excellent point, undefined behavior is exactly like an LLM. Surely this is what “alignment” in the standard is talking about.