302 points Bogdanp | 139 comments
1. taylorallred ◴[] No.44390996[source]
So there's this guy you may have heard of called Ryan Fleury who makes the RAD debugger for Epic. The whole thing is made with 278k lines of C and is built as a unity build (all the code is included into one file that is compiled as a single translation unit). On a decent Windows machine it takes 1.5 seconds to do a clean compile. This seems like a clear case study that compilation can be incredibly fast and makes me wonder why other languages like Rust and Swift can't just do something similar to achieve similar speeds.
replies(18): >>44391046 #>>44391066 #>>44391100 #>>44391170 #>>44391214 #>>44391359 #>>44391671 #>>44391740 #>>44393057 #>>44393294 #>>44393629 #>>44394710 #>>44395044 #>>44395135 #>>44395226 #>>44395485 #>>44396044 #>>44401496 #
2. js2 ◴[] No.44391046[source]
"Just". Probably because there's a lot of complexity you're waving away. Almost nothing is ever simple as "just".
replies(2): >>44391209 #>>44391231 #
3. tptacek ◴[] No.44391066[source]
I don't think it's interesting to observe that C code can be compiled quickly (so can Go, a language designed specifically for fast compilation). It's not a problem intrinsic to compilation; the interesting hard problem is to make Rust's semantics compile quickly. This is a FAQ on the Rust website.
4. lordofgibbons ◴[] No.44391100[source]
The more your compiler does for you at build time, the longer it will take to build, it's that simple.

Go has sub-second build times even on massive code-bases. Why? because it doesn't do a lot at build time. It has a simple module system, (relatively) simple type system, and leaves a whole bunch of stuff be handled by the GC at runtime. It's great for its intended use case.

When you have things like macros, advanced type systems, and want robustness guarantees at build time.. then you have to pay for that.

replies(9): >>44391549 #>>44391582 #>>44391630 #>>44391910 #>>44394240 #>>44395833 #>>44397304 #>>44401934 #>>44402705 #
5. dhosek ◴[] No.44391170[source]
Because Rust and Swift are doing much more work than a C compiler would? The analysis necessary for the borrow checker is not free, likewise with a lot of other compile-time checks in both languages. C can be fast because it effectively does no compile-time checking of things beyond basic syntax, so you can pass an int to foo(char) and do other unholy things.
replies(5): >>44391210 #>>44391240 #>>44391254 #>>44391268 #>>44391426 #
6. pixelpoet ◴[] No.44391209[source]
At a previous company, we had a rule: whoever says "just" gets to implement it :)
replies(1): >>44391945 #
7. drivebyhooting ◴[] No.44391210[source]
That’s not a good example. foo(int) is analyzed by the compiler and a type conversion is inserted. The language spec might be bad, but this isn’t the compiler cutting corners.
8. Aurornis ◴[] No.44391214[source]
> makes me wonder why other languages like Rust and Swift can't just do something similar to achieve similar speeds.

One of the primary features of Rust is the extensive compile-time checking. Monomorphization is also a complex operation, which is not exclusive to Rust.

C compile times should be very fast because it's a relatively low-level language.

On the grand scale of programming languages and their compile-time complexity, C code is closer to assembly language than modern languages like Rust or Swift.
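The monomorphization mentioned above can be sketched in a few lines of Rust: each concrete type a generic function is used with gets its own compiled copy of the function body (names here are hypothetical, just for illustration):

```rust
// Monomorphization: the compiler emits a separate machine-code copy of
// `largest` for every concrete T it is called with.
fn largest<T: PartialOrd + Copy>(items: &[T]) -> T {
    let mut max = items[0];
    for &x in &items[1..] {
        if x > max {
            max = x;
        }
    }
    max
}

fn main() {
    // Two instantiations: largest::<i32> and largest::<f64> are
    // each compiled and optimized independently.
    assert_eq!(largest(&[1, 5, 3]), 5);
    assert_eq!(largest(&[1.0, 0.5]), 1.0);
}
```

This is why heavy use of generics multiplies the amount of code the backend has to optimize, independent of any safety checking.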

9. taylorallred ◴[] No.44391231[source]
That "just" was too flippant. My bad. What I meant to convey is "hey, there's some fast compiling going on here and it wasn't that hard to pull off. Can we at least take a look at why that is and maybe do the same thing?".
replies(1): >>44391246 #
10. steveklabnik ◴[] No.44391240[source]
The borrow checker is usually a blip on the overall graph of compilation time.

The overall principle is sound though: it's true that doing some work is more than doing no work. But the borrow checker and other safety checks are not the root of compile time performance in Rust.

replies(1): >>44392271 #
11. steveklabnik ◴[] No.44391246{3}[source]
> "hey, there's some fast compiling going on here and it wasn't that hard to pull off. Can we at least take a look at why that is and maybe do the same thing?".

Do you really believe that nobody over the course of Rust's lifetime has ever taken a look at C compilers and thought about if techniques they use could apply to the Rust compiler?

replies(1): >>44391276 #
12. taylorallred ◴[] No.44391254[source]
These languages do more at compile time, yes. However, I learned from Ryan's Discord server that he did a unity build in a C++ codebase and got similar results (just a few seconds slower than the C code). Also, you could see in the article that most of the time was being spent in LLVM and linking. With a unity build, you nearly cut out the link step entirely. Rust and Swift do some sophisticated things (Hindley-Milner type inference, generics, etc.) but I have my doubts that those things cause the most slowdown.
13. Thiez ◴[] No.44391268[source]
This explanation gets repeated over and over again in discussions about the speed of the Rust compiler, but apart from rare pathological cases, the majority of time in a release build is not spent doing compile-time checks, but in LLVM. Rust has zero-cost abstractions, but the zero-cost refers to runtime; sadly there's a lot of junk generated at compile time that LLVM has to work to remove. Which it does, very well, but at the cost of slower compilation.
replies(1): >>44391818 #
14. taylorallred ◴[] No.44391276{4}[source]
Of course not. But it wouldn't surprise me if nobody thought to use a unity build. (Maybe they did. Idk. I'm curious).
replies(2): >>44391368 #>>44391428 #
15. ceronman ◴[] No.44391359[source]
I bet that if you take those 278k lines of code and rewrite them in simple Rust, without using generics, or macros, and using a single crate, without dependencies, you could achieve very similar compile times. The Rust compiler can be very fast if the code is simple. It's when you have dependencies and heavy abstractions (macros, generics, traits, deep dependency trees) that things become slow.
replies(2): >>44391384 #>>44392083 #
16. steveklabnik ◴[] No.44391368{5}[source]
Rust and C have differences around compilation units: Rust's already tend to be much larger than C on average, because the entire crate (aka tree of modules) is the compilation unit in Rust, as opposed to the file-based (okay not if you're on some weird architecture) compilation unit of C.

Unity builds are useful for C programs because they tend to reduce header processing overhead, whereas Rust does not have the preprocessor or header files at all.

They also can help with reducing the number of object files (down to one from many), so that the linker has less work to do, this is already sort of done (though not to literally one) due to what I mentioned above.

In general, the conventional advice is to do the exact opposite: breaking large Rust projects into more, smaller compilation units can help do less "spurious" rebuilding, so smaller changes have less overall impact.

Basically, Rust's compile time issues lie elsewhere.

17. 90s_dev ◴[] No.44391384[source]
I can't help but think the borrow checker alone would slow this down by at least 1 or 2 orders of magnitude.
replies(3): >>44391473 #>>44391548 #>>44391760 #
18. jvanderbot ◴[] No.44391426[source]
If you'd like the rust compiler to operate quickly:

* Make no nested types - these slow compile times a lot

* Include no crates, or ones that emphasize compiler speed

C is still v. fast though. That's why I love it (and Rust).

replies(1): >>44394947 #
19. ameliaquining ◴[] No.44391428{5}[source]
Can you explain why a unity build would help? Conventional wisdom is that Rust compilation is slow in part because it has too few translation units (one per crate, plus codegen units which only sometimes work), not too many.
20. steveklabnik ◴[] No.44391473{3}[source]
Your intuition would be wrong: the borrow checker does not take much time at all.
21. FridgeSeal ◴[] No.44391548{3}[source]
Again, as has been often repeated and backed up with data: the borrow checker is a tiny fraction of a Rust app's build time; the biggest chunk of time is spent in LLVM.
22. ChadNauseam ◴[] No.44391549[source]
That the type system is responsible for rust's slow builds is a common and enduring myth. `cargo check` (which just does typechecking) is actually usually pretty fast. Most of the build time is spent in the code generation phase. Some macros do cause problems as you mention, since the code that contains the macro must be compiled before the code that uses it, so they reduce parallelism.
replies(3): >>44391716 #>>44392132 #>>44397412 #
23. duped ◴[] No.44391582[source]
I think this is mostly a myth. If you look at Rust compiler benchmarks, while typechecking isn't _free_ it's also not the bottleneck.

A big reason that amalgamation builds of C and C++ can absolutely fly is that they aren't reparsing headers and they generate exactly one object file, so the linker has no work to do.

Once you add static linking to the toolchain (in all of its forms) things get really fucking slow.

Codegen is also a problem. Rust tends to generate a lot more code than C or C++, so even once the compiler is done with most of its typechecking work, the backend and assembler have a lot of things to churn through.

replies(6): >>44392553 #>>44392826 #>>44394891 #>>44396127 #>>44396258 #>>44396355 #
24. cogman10 ◴[] No.44391630[source]
Yes but I'd also add that Go specifically does not optimize well.

The compiler is optimized for compilation speed, not runtime performance. Generally speaking, it does well enough. Especially because its use case is often applications where "good enough" is good enough (i.e., IO-heavy applications).

You can see that with "gccgo". Slower to compile, faster to run.

replies(2): >>44392046 #>>44399907 #
25. maxk42 ◴[] No.44391671[source]
Rust is doing a lot more under the hood. C doesn't track variable lifetimes, ownership, or generics, doesn't handle dependency management, and has no compile-time execution (beyond the limited language that is the preprocessor). The Rust compiler also makes intelligent (scary intelligent!) suggestions when you've made a mistake: it needs a lot of context to be able to do that.

The rust compiler is actually pretty fast for all the work it's doing. It's just an absolutely insane amount of additional work. You shouldn't expect it to compile as fast as C.

26. rstuart4133 ◴[] No.44391716{3}[source]
> Most of the build time is spent in the code generation phase.

I can believe that, but even so it's caused by the type system monomorphising everything. When you use qsort from libc, you are using pre-compiled code from a library. When you use slice::sort(), you get custom assembly compiled to suit your application. Thus, there is a lot more code generation going on, and that is caused by the tradeoffs they've made with the type system.

Rust's approach gives you all sorts of advantages, like fast code and strong compile-time type checking. But it comes with warts too, like fat binaries, and a bug in slice::sort() can't be fixed by just shipping a new std dynamic library, because there is no such library. It's been recompiled, just for you.

FWIW, modern C++ (like boost) that places everything in templates in .h files suffers from the same problem. If Swift suffers from it too, I'd wager it's the same cause.
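The qsort-vs-slice::sort contrast above can be made concrete: every distinct comparator closure passed to a Rust sort gets its own monomorphized copy of the sort routine, whereas C's qsort reuses one compiled routine and calls the comparator through a function pointer. A minimal sketch:

```rust
fn main() {
    // Instantiation #1: sort_by specialized for this ascending closure.
    let mut a = vec![3, 1, 2];
    a.sort_by(|x, y| x.cmp(y));
    assert_eq!(a, [1, 2, 3]);

    // Instantiation #2: a different closure type, so the compiler emits
    // another full copy of the sort, with the comparison inlined.
    let mut b = vec![3, 1, 2];
    b.sort_by(|x, y| y.cmp(x));
    assert_eq!(b, [3, 2, 1]);
}
```

Fast at runtime (no indirect calls), but each instantiation is more work for codegen and LLVM.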

replies(1): >>44395255 #
27. vbezhenar ◴[] No.44391740[source]
I encountered one project in the 2000s with a few dozen KLoC of C++. It compiled in a fraction of a second on an old computer. My hello-world code with Boost took a few seconds to compile. So it's not just about the language, it's about structuring your code and using features with heavy compilation cost. I'm pretty sure that you can write Doom with C macros and it won't compile fast. I'm also pretty sure that you can write Rust code in a way that compiles very fast.
replies(2): >>44391871 #>>44394831 #
28. tomjakubowski ◴[] No.44391760{3}[source]
The borrow checker is really not that expensive. On a random example, a release build of the regex crate, I see <1% of time spent in borrowck. >80% is spent in codegen and LLVM.
29. vbezhenar ◴[] No.44391818{3}[source]
Is it possible to generate less junk? Sounds like the compiler developers took shortcuts, which could be improved over time.
replies(3): >>44392001 #>>44392115 #>>44394849 #
30. taylorallred ◴[] No.44391871[source]
I'd be very interested to see a list of features/patterns and the cost that they incur on the compiler. Ideally, people should be able to use the whole language without having to wait so long for the result.
replies(3): >>44391972 #>>44392259 #>>44394858 #
31. Zardoz84 ◴[] No.44391910[source]
D compilers do more than any C++ compiler (metaprogramming, a better template system, and compile-time execution) and are hugely faster. Language syntax design has a role here.
32. forrestthewoods ◴[] No.44391945{3}[source]
I wanted to ban “just” but your rule is better. Brilliant.
33. kccqzy ◴[] No.44391972{3}[source]
Templates as one single feature can be hugely variable. Its effect on compilation time can be unmeasurable. Or you can easily write a few dozen lines that will take hours to compile.
replies(1): >>44402984 #
34. rcxdude ◴[] No.44392001{4}[source]
Probably, but it's the kind of thing that needs a lot of fairly significant overhauls in the compiler architecture to really move the needle on, as far as I understand.
35. cherryteastain ◴[] No.44392046{3}[source]
Is gccgo really faster? Last time I looked it looked like it was abandoned (stuck at go 1.18, had no generics support) and was not really faster than the "actual" compiler.
replies(1): >>44393453 #
36. taylorallred ◴[] No.44392083[source]
I'm curious about that point you made about dependencies. This Rust project (https://github.com/microsoft/edit) is made with essentially no dependencies, is 17,426 lines of code, and on an M4 Max it compiles in 1.83s debug and 5.40s release. The code seems pretty simple as well. Edit: Note also that this is 10k more lines than the OP's project. This certainly makes those deps suspicious.
replies(1): >>44392867 #
37. zozbot234 ◴[] No.44392115{4}[source]
You can address the junk problem manually by having generic functions delegate as much of their work as possible to non-generic or "less" generic functions, where a "less" generic function is one that depends only on a known subset of type traits, such as size or alignment. Delegating can help the compiler generate fewer redundant copies of your code, even if it can't avoid monomorphization altogether.
replies(1): >>44394609 #
38. tedunangst ◴[] No.44392132{3}[source]
I just ran cargo check on nushell, and it took a minute and a half. I didn't time how long it took to compile, maybe five minutes earlier today? So I would call it faster, but still not fast.

I was all excited to conduct the "cargo check; mrustc; cc" is 100x faster experiment, but I think at best, the multiple is going to be pretty small.

replies(2): >>44392588 #>>44395154 #
39. vbezhenar ◴[] No.44392259{3}[source]
So there are a few distinctive patterns I observed in that project. Please note that many of these patterns are considered anti-patterns by many people, so I don't necessarily suggest using them.

1. Use pointers, and do not include the header file for a class if you only need a pointer to that class. I think that's a pretty established pattern in C++: if you want to declare a pointer to a class in your header, you just write `class SomeClass;` instead of `#include "SomeClass.hpp"`.

2. Do not use the STL or IOStreams. That project used only libc and the POSIX API. I know the author really hated the STL and considered it a huge mistake to have been included in the standard language.

3. Avoid generic templates unless absolutely necessary. Templates force you to write your code in header files, so it'll be parsed multiple times for every include, compiled into multiple copies, etc. And even when you use templates, try to split the class into generic and non-generic parts, so some code can be moved from the header to a source file. Generally prefer run-time polymorphism to generic compile-time polymorphism.

replies(2): >>44392908 #>>44402983 #
40. kimixa ◴[] No.44392271{3}[source]
While the borrow checker is one big difference, it's certainly not the only thing the rust compiler offers on top of C that takes more work.

Stuff like inserting bounds checking puts more work on the optimization passes and codegen backend as it simply has to deal with more instructions. And that then puts more symbols and larger sections in the input to the linker, slowing that down. Even if the frontend "proves" it's unnecessary that calculation isn't free. Many of those features are related to "safety" due to the goals of the language. I doubt the syntax itself really makes much of a difference as the parser isn't normally high on the profiled times either.

Generally it provides stricter checks that are normally punted to a linter tool in the c/c++ world - and nobody has accused clang-tidy of being fast :P

replies(1): >>44395387 #
41. treyd ◴[] No.44392553{3}[source]
Not only does it generate more code, the initially generated code before optimizations is also often worse. For example, heavy use of iterators means a ton of generics being instantiated and a ton of call code for setting up and tearing down call frames. This gets heavily inlined and flattened out, so in the end it's extremely well-optimized, but it's a lot of work for the compiler. Writing it all out classically with for loops and ifs is possible, but it's harder to read.
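A minimal sketch of the iterator point above: the chained version instantiates several generic adapters that LLVM must inline and flatten, while the hand-written loop gives the backend far less to chew on (function names here are hypothetical):

```rust
// Iterator chain: Map, Filter, and Sum adapters are all monomorphized
// for this exact closure combination, then inlined away by LLVM.
fn sum_of_even_squares(xs: &[i64]) -> i64 {
    xs.iter().map(|x| x * x).filter(|x| x % 2 == 0).sum()
}

// The classic equivalent: same result, much less generic machinery
// for the compiler to instantiate and optimize out.
fn sum_of_even_squares_loop(xs: &[i64]) -> i64 {
    let mut total = 0;
    for &x in xs {
        let sq = x * x;
        if sq % 2 == 0 {
            total += sq;
        }
    }
    total
}

fn main() {
    let xs = [1, 2, 3, 4];
    assert_eq!(sum_of_even_squares(&xs), 20);
    assert_eq!(sum_of_even_squares_loop(&xs), 20);
}
```

Both end up as comparable machine code in release builds; the difference is in how much work it takes to get there.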
replies(1): >>44397082 #
42. ChadNauseam ◴[] No.44392588{4}[source]
Did you do it from a clean build? In that case, it's actually a slightly misleading metric, since rust needs to actually compile macros in order to typecheck code that uses them. (And therefore must also compile all the code that the macro depends on.) My bad for suggesting it, haha. Incremental cargo check is often a better way of seeing how long typechecking takes, since usually you haven't modified any macros that will need to be recompiled. On my project at work, incremental cargo check takes `1.71s`.
replies(2): >>44397210 #>>44402957 #
43. fingerlocks ◴[] No.44392826{3}[source]
The Swift compiler is definitely bottlenecked by type checking. For example, as a language requirement, generic types are left more or less intact after compilation and are type checked independently of how they are used. This is unlike C++ templates, which effectively copy-paste the resolved types into the generic at every instantiation.

This has tradeoffs: increased ABI stability at the cost of longer compile times.

replies(3): >>44393659 #>>44394911 #>>44397467 #
44. MindSpunk ◴[] No.44392867{3}[source]
The 'essentially no dependencies' claim isn't entirely true. It depends on the 'windows' crate, which is Microsoft's auto-generated Win32 bindings. The 'windows' crate is huge and leads to hundreds of thousands of LoC being pulled in.

There's some other dependencies in there that are only used when building for test/benchmarking like serde, zstd, and criterion. You would need to be certain you're building only the library and not the test harness to be sure those aren't being built too.

replies(1): >>44396190 #
45. dieortin ◴[] No.44392908{4}[source]
Why use C++ at that point? Also, forward-declaring classes instead of including the corresponding headers has quite a few drawbacks.
replies(2): >>44393023 #>>44394674 #
46. kortilla ◴[] No.44393023{5}[source]
RAII? shared pointers?
47. glandium ◴[] No.44393057[source]
That is kind of surprising. The sqlite "unity" build has about the same number of lines of C and takes a lot longer than that to compile.
48. ben-schaaf ◴[] No.44393294[source]
Every claim I've seen about unity builds being fast just never rings true to me. I just downloaded the rad debugger and ran the build script on a 7950x (about as fast as you can get). A debug build took 5s, a release build 34s with either gcc or clang.

Maybe it's a MSVC thing - it does seem to have some multi-threading stuff. In any case raddbg non-clean builds take longer than any of my rust projects.

replies(3): >>44394691 #>>44400609 #>>44405178 #
49. cogman10 ◴[] No.44393453{4}[source]
Digging around, looks like it's workload dependent.

For pure computational workloads, it'll be faster. However, anything with heavy allocation will suffer, as apparently the gccgo GC and GC-related optimizations aren't as good as those of gc, the standard Go compiler.

replies(1): >>44402614 #
50. 1vuio0pswjnm7 ◴[] No.44393629[source]
Alpha. Windows-only.

https://codeload.github.com/EpicGamesExt/raddebugger/tar.gz/...

51. willtemperley ◴[] No.44393659{4}[source]
A lot can be done by the programmer to mitigate slow builds in Swift. Breaking up long expressions into smaller ones and using explicit types where type inference is expensive for example.

I’d like to see tooling for this to pinpoint bottlenecks - it’s not always obvious what’s making builds slow.

replies(2): >>44394713 #>>44396157 #
52. Mawr ◴[] No.44394240[source]
Not really. The root reason behind Go's fast compilation is that it was specifically designed to compile fast. The implementation details are just a natural consequence of that design decision.

Since fast compilation was a goal, every part of the design was looked at through a rough "can this be a horrible bottleneck?" lens, and discarded if so. For example, the import (package) system was designed to avoid the horrible, inefficient mess of C++. It's obvious that you never want to compile the same package more than once and that you need to support parallel package compilation. These may be blindingly obvious, but if you don't think about compilation speed at design time, you'll get this wrong and will never be able to fix it.

As far as optimizations vs compile speed goes, it's just a simple case of diminishing returns. Since Rust has maximum possible performance as a goal, it's forced to go well into diminishing-returns territory, sacrificing a ton of compile speed for minor performance improvements. Go has far more modest performance goals, so it can get 80% of the possible performance for only 20% of the compile cost. Rust can't afford to relax its stance because it's competing with languages like C++, and to some extent C, that are willing to go to any length to squeeze out an extra 1% of performance.

replies(1): >>44402975 #
53. andrepd ◴[] No.44394609{5}[source]
Isn't something like this blocked on the lack of specialisation?
replies(1): >>44395868 #
54. maccard ◴[] No.44394674{5}[source]
References, for one. Also there’s a huge difference between “avoid templates unless necessary” and “don’t use templates”.
55. maccard ◴[] No.44394691[source]
I use unity builds day in day out. The speed up is an order of magnitude on a 2m+ LOC project.

If you want to see the difference download unreal engine and compile the editor with and without unity builds enabled.

My experience has been the polar opposite of yours - similar size rust projects are an order of magnitude slower than C++ ones. Could you share an example of a project to compare with?

replies(2): >>44397204 #>>44397303 #
56. troupo ◴[] No.44394710[source]
There's also Jonathan Blow's jai where he routinely builds an entire game from scratch in a few seconds (hopefully public beta will be released by the end of this year).
57. ykonstant ◴[] No.44394713{5}[source]
>I’d like to see tooling for this to pinpoint bottlenecks - it’s not always obvious what’s making builds slow.

I second this enthusiastically.

replies(1): >>44395588 #
58. herewulf ◴[] No.44394831[source]
My anecdata would be that the average C++ developer puts includes inside of every header file which includes more headers to the point where everything is including everything else and a single .cpp file draws huge swaths of unnecessary code in and the project takes eons to compile on a fast computer.

That's my 2000s development experience. Fortunately I've spent a good chunk of the 2010s and most of the 2020s using other languages.

The classic XKCD compilation comic exists for a reason.

replies(1): >>44402994 #
59. LtdJorge ◴[] No.44394849{4}[source]
Well, zero-cost abstractions are still abstractions. It’s not junk per se, but things that will be optimized out if the IR has enough information to safely do so: basically lots of extra metadata to actually prove to LLVM that these things are safe.
60. LtdJorge ◴[] No.44394858{3}[source]
There is an experimental Cranelift backend[0] for rustc to improve compilation performance in debug builds.

https://github.com/rust-lang/rustc_codegen_cranelift

61. windward ◴[] No.44394891{3}[source]
>Codegen is also a problem. Rust tends to generate a lot more code than C or C++

Wouldn't you say a lot of that comes from the macros and (by way of monomorphisation) the type system?

replies(1): >>44397739 #
62. windward ◴[] No.44394911{4}[source]
>This is unlike C++ templates which are effectively copy-pasting the resolved type with the generic for every occurrence of type resolution.

Even this can lead to unworkable compile times, to the point that code is rewritten.

63. windward ◴[] No.44394947{3}[source]
>Make no nested types

I wouldn't like it that much

64. rowanG077 ◴[] No.44395044[source]
C hardly requires any high-effort work at compile time. No templates, no generics, super simple types, no high-level structures.
replies(1): >>44395997 #
65. weinzierl ◴[] No.44395135[source]
This is sometimes called amalgamation and you can do it in Rust as well. Either manually or with tools. The point is that, apart from very specific niches, it is just not a practical approach.

It's not that it can't be done but that it usually is not worth the hassle and our goal should be for compilation to be fast despite not everything being in one file.

Turbo Pascal is a prime example for a compiler that won the market not least because of its - for the time - outstanding compilation speed.

In the same vein, a language can be designed for fast compilation. Pascal in general was designed for single-pass compilation, which made it naturally fast. All the necessary forward declarations were a pain, though, and the victory of languages that are not designed for single-pass compilation proves that, while doable, it was not worth it in the end.

66. CryZe ◴[] No.44395154{4}[source]
A ton of that is actually still doing codegen (for the proc macros for example).
67. TZubiri ◴[] No.44395226[source]
I guess you can do that, but if for some reason you needed to compile separately (suppose you sell the system to a client, and they need to modify module 1, module 2, and the main loop), it would be pretty trivial to remove some #include "module3.c" lines and compile module3.c separately with -c. Right?

I'm not sure what Rust or docker have to do with this basic issue, it just feels like young blood attempting 2020 solutions before exploring 1970 solutions.

68. badmintonbaseba ◴[] No.44395255{4}[source]
It's partly the type system. You can implement a std::sort (or slice::sort()) that just delegates to qsort or a qsort-like implementation and get roughly the same compile-time performance as using qsort directly.

But not having to is a win, as the monomorphised sorts are just much faster at runtime than having to do an indirect call for each comparison.

replies(2): >>44397138 #>>44401996 #
69. simonask ◴[] No.44395387{4}[source]
It truly is not about bounds checks. Index lookups are rare in practical Rust code, and the amount of code generated from them is minuscule.

But it _is_ about the sheer volume of stuff passed to LLVM, as you say, which comes from a couple of places, mostly related to monomorphization (generics), but also many calls to tiny inlined functions. Incidentally, this is also what makes many "modern" C++ projects slow to compile.

In my experience, similarly sized Rust and C++ projects seem to see similar compilation times. Sometimes C++ wins due to better parallelization (translation units in Rust are crates, not source files).

70. john-h-k ◴[] No.44395485[source]
My C compiler, which is pretty naive and around ~90,000 lines, can compile _itself_ in around 1 second. Clang can do it in like 0.4.

The simple truth is a C compiler doesn’t need to do very much!

71. glhaynes ◴[] No.44395588{6}[source]
I'll third it. I've started to see more and more cargo culting of "fixes" that I'm extremely suspicious do nothing aside from making the code bulkier.
72. jstanley ◴[] No.44395833[source]
> Go has sub-second build times even on massive code-bases.

Unless you use sqlite, in which case your build takes a million years.

replies(2): >>44396492 #>>44399849 #
73. dwattttt ◴[] No.44395868{6}[source]
I believe the specific advice they're referring to has been stable for a while. You take your generic function & split it into a thin generic wrapper, and a non-generic worker.

As an example, say your function takes anything that can be turned into a String. You'd write a generic wrapper that does the ToString step, then change the existing function to just take a String. That way when your function is called, only the thin outer function is monomorphised, and the bulk of the work is a single implementation.

It's not _that_ commonly known, as it only becomes a problem for a library that becomes popular.
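The wrapper-plus-worker split described above can be sketched in a few lines (function names hypothetical, for illustration only):

```rust
// Thin generic shim: this is the only part monomorphized per caller
// type, and it does nothing but the conversion.
fn greet<S: ToString>(name: S) -> String {
    greet_inner(name.to_string())
}

// Non-generic worker: compiled exactly once, shared by every caller.
fn greet_inner(name: String) -> String {
    format!("hello, {name}")
}

fn main() {
    // &str and String callers both funnel into the single greet_inner.
    assert_eq!(greet("world"), "hello, world");
    assert_eq!(greet(String::from("world")), "hello, world");
}
```

If the worker were inlined into the generic function, its whole body would be duplicated for every caller type instead of just the one-line conversion.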

replies(1): >>44397255 #
74. dgb23 ◴[] No.44395997[source]
Are we seeing similar compilation speed when a Rust program doesn't use these types of features?
replies(1): >>44406898 #
75. motorest ◴[] No.44396044[source]
> This seems like a clear case-study that compilation can be incredibly fast (...)

Have you tried troubleshooting a compiler error in a unity build?

Yeah.

replies(1): >>44399102 #
76. the-lazy-guy ◴[] No.44396127{3}[source]
> Once you add static linking to the toolchain (in all of its forms) things get really fucking slow.

Could you expand on that, please? Every time you run a dynamically linked program, it is linked at runtime (unless it explicitly avoids linking unnecessary stuff by dlopening things lazily, which pretty much never happens). If it is fine to link on every program launch, linking at build time should not be a problem at all.

If you want to have link time optimization, that's another story. But you absolutely don't have to do that if you care about build speed.

replies(1): >>44403903 #
77. never_inline ◴[] No.44396157{5}[source]
> Breaking up long expressions into smaller ones

If it improves compile time, that sounds like a bug in the compiler or the design of the language itself.

78. ◴[] No.44396190{4}[source]
79. blizdiddy ◴[] No.44396258{3}[source]
Go is static by default and still fast as hell
replies(1): >>44396404 #
80. benreesman ◴[] No.44396355{3}[source]
The meme that static linking is slow or produces anything other than the best executables is demonstrably false and the result of surprisingly sinister agendas. Get out readelf, nm, and ps sometime and do the arithmetic: most programs don't link much of glibc (and its static link is broken by design; musl is better at just about everything). Matt Godbolt has a great talk about how dynamic linking actually works that should give anyone pause.

DLLs got their start when early windowing systems didn't quite fit on the workstations of the era in the late 80s / early 90s.

In about 4 minutes both Microsoft and GNU were like, "let me get this straight, it will never work on another system and I can silently change it whenever I want?" Debian went along because it gives distro maintainers degrees of freedom they like and don't bear the costs of.

Fast forward 30 years and Docker is too profitable a problem to fix by the simple expedient of calling a stable kernel ABI on anything, and don't even get me started on how penetrated everything but libressl and libsodium are. Protip: TLS is popular with the establishment because even Wireshark requires special settings and privileges for a user to see their own traffic, security patches my ass. eBPF is easier.

Dynamic linking moves control from users to vendors and governments at ruinous cost in performance, props up bloated industries like the cloud compute and Docker industrial complex, and should die in a fire.

Don't take my word for it, swing by cat-v.org sometimes and see what the authors of Unix have to say about it.

I'll save the rant about how rustc somehow manages to be slower than clang++ and clang-tidy combined for another day.

replies(3): >>44396760 #>>44396875 #>>44396975 #
81. vintagedave ◴[] No.44396404{4}[source]
Delphi is static by default and incredibly fast too.
replies(2): >>44399822 #>>44402941 #
82. Groxx ◴[] No.44396492{3}[source]
Yeah, I deal with multiple Go projects that take a couple minutes to link the final binary, much less build all the intermediates.

Compilation speed depends on what you do with a language. "Fast" is not an absolute, and for most people it depends heavily on community habits. Rust habits tend to favor extreme optimizability and/or extreme compile-time guarantees, and that's obviously going to be slower than simpler code.

83. jelder ◴[] No.44396760{4}[source]
CppCon 2018: Matt Godbolt “The Bits Between the Bits: How We Get to main()"

https://www.youtube.com/watch?v=dOfucXtyEsU

84. jrmg ◴[] No.44396875{4}[source]
> …surprisingly sinister agendas.

> Dynamic linking moves control from users to vendors and governments at ruinous cost in performance, props up bloated industries...

This is ridiculous. Not everything is a conspiracy!

replies(4): >>44396943 #>>44398118 #>>44399499 #>>44400236 #
85. k__ ◴[] No.44396943{5}[source]
That's an even more reasonable fear than trusting trust, and people seem to take that seriously.
86. duped ◴[] No.44396975{4}[source]
I think you're confused about my comment and this thread - I'm talking about build times.
replies(1): >>44398148 #
87. estebank ◴[] No.44397082{4}[source]
For loops are sugar around an Iterator instantiation:

  for i in 0..10 {}
translates to roughly

  let mut iter = Range { start: 0, end: 10 }.into_iter();
  while let Some(i) = iter.next() {}
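A minimal runnable check of that equivalence (function names here are illustrative, not anything the compiler emits):

```rust
fn sugared() -> Vec<i32> {
    let mut out = Vec::new();
    for i in 0..10 {
        out.push(i);
    }
    out
}

// Roughly what the compiler emits for the `for` loop above:
// take the IntoIterator, then drive it with `while let`.
fn desugared() -> Vec<i32> {
    let mut out = Vec::new();
    let mut iter = (0..10).into_iter();
    while let Some(i) = iter.next() {
        out.push(i);
    }
    out
}

fn main() {
    assert_eq!(sugared(), desugared());
}
```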
88. estebank ◴[] No.44397138{5}[source]
This is a pattern a crate author can rely on (write a function that uses generics and immediately delegates to a function that uses trait objects, or converts to the needed types eagerly, so the common logic gets compiled only once), and there have been multiple efforts to have the compiler do that automatically. It has been called polymorphization and it comes up every now and then: https://internals.rust-lang.org/t/add-back-polymorphization/...
89. almostgotcaught ◴[] No.44397204{3}[source]
How many LOC is Unreal? I'm trying to estimate whether making LLVM compatible with UNITY_BUILD would be worth the effort.

EDIT: I signed up to get access to Unreal to take a look at how they do unity builds, and it turns out they have their own build tool (not CMake) that orchestrates the build. So does anyone know (can someone comment) whether unity builds for them (Unreal) means literally one file for literally all project source files, or whether it's higher-granularity like UNITY_BUILD in CMake (i.e., a single file per object)?
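For reference, CMake's batched flavor (available since 3.16) sits between the two extremes; the target name and batch size below are illustrative, not LLVM's actual configuration:

```cmake
# Group every 16 source files of a target into one jumbo translation unit.
# CMake generates the combining source files itself; no source changes are
# needed as long as files don't clash on macros or file-static symbols.
set(CMAKE_UNITY_BUILD ON)                 # or pass -DCMAKE_UNITY_BUILD=ON
set_target_properties(my_target PROPERTIES
    UNITY_BUILD ON
    UNITY_BUILD_BATCH_SIZE 16)
```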

replies(2): >>44397636 #>>44399260 #
90. estebank ◴[] No.44397210{5}[source]
Side note: There's an effort to cache proc macro invocations so that they get executed only once if the item they annotate hasn't changed: https://github.com/rust-lang/rust/pull/129102

There are multiple caveats on providing this to users (we can't assume that macro invocations are idempotent, so the new behavior would have to be opt in, and this only benefits incremental compilation), but it's in our radar.

91. estebank ◴[] No.44397255{7}[source]
To illustrate:

  fn foo<S: Into<String>>(s: S) {
      // Non-generic inner body: compiled once, however many types
      // `foo` is instantiated with.
      fn inner(s: String) { println!("{s}") }
      inner(s.into())
  }
replies(1): >>44399834 #
92. ben-schaaf ◴[] No.44397303{3}[source]
> If you want to see the difference download unreal engine and compile the editor with and without unity builds enabled.

UE doesn't use a full unity build, it groups some files together into small "modules". I can see how this approach may have some benefits; you're trading off a faster clean build for a slower incremental build.

I tested compiling UnrealFrontend, and a default setup with the hybrid unity build took 148s. I noticed it was only using half my cores due to memory constraints. I disabled unity and upped the parallelism and got 182s, so 22% slower while still using less memory. A similarly configured unity build was 108s, so best case is ~2x.

On the other hand, only changing the file TraceTools/SFilterPreset.cpp resulted in a 10s compilation time under a unity build, and only 2s without unity.

I can see how this approach has its benefits (and drawbacks). But to be clear this isn't what projects like raddbg and sqlite3 are doing. They're doing a single translation unit for the entire project. No parallelism, no incremental builds, just a single compiler invocation. This is usually what I've seen people mean by a unity build.

> My experience has been the polar opposite of yours - similar size rust projects are an order of magnitude slower than C++ ones. Could you share an example of a project to compare with?

I just did a release build of egui in 35s, about the same as raddbg's release build. This includes compiling dependencies like wgpu, serde and about 290 other dependencies which add up to well over a million lines of code.

Note I do have mold configured as my linker, which speeds things up significantly.

replies(1): >>44403925 #
93. phplovesong ◴[] No.44397304[source]
That's not really true. As a counter-example, OCaml has a very advanced type system, full type inference, generics, and all that jazz. Still, it's on par with Go to compile, or even faster.
94. ◴[] No.44397412{3}[source]
95. slavapestov ◴[] No.44397467{4}[source]
> This has tradeoffs: increased ABI stability at the cost of longer compile times.

Nah. Slow type checking in Swift is primarily caused by the fact that functions and operators can be overloaded on type.

Separately-compiled generics don't introduce any algorithmic complexity and are actually good for compile time, because you don't have to re-type check every template expansion more than once.

replies(2): >>44399893 #>>44399968 #
96. Culonavirus ◴[] No.44397636{4}[source]
At least 10M (from what I remember, maybe more now)
97. jandrewrogers ◴[] No.44397739{4}[source]
Modern C++ in particular does a lot of similar, albeit not identical, codegen due to its extensive metaprogramming facilities. (C is, of course, dead simple.) I've never looked into it too much but anecdotally Rust does seem to generate significantly more code than C++ in cases where I would intuitively expect the codegen to be similar. For whatever reason, the "in theory" doesn't translate to "in practice" reliably.

I suspect this leaks into both compile-time and run-time costs.

98. benreesman ◴[] No.44398118{5}[source]
I didn't say anything was a conspiracy, let alone everything. I said inferior software is promoted by vendors on Linux as well as on MacOS and Windows with unpleasant consequences for users in a way that serves those vendors and the even more powerful institutions to which they are beholden. Sinister intentions are everywhere in this business (go read the opinions of the people who run YC), that's not even remotely controversial.

In fact, if there were anything remotely controversial about a bunch of extremely specific, extremely falsifiable claims I made, one imagines your rebuttal would have mentioned at least one.

I said inflammatory things (Docker is both arsonist and fireman, at ruinous cost), but they're fucking true. That Alpine in the Docker jank? Links musl!

99. benreesman ◴[] No.44398148{5}[source]
You said something false and important and I took the opportunity to educate anyone reading about why this aspect of their computing experience is a mess. All of that is germane to how we ended up in a situation where someone is calling rustc with a Dockerfile and this is considered normal.
replies(1): >>44400814 #
100. moffkalast ◴[] No.44399102[source]
It compiles in 2 seconds! Does it run? No, but it was fast!
replies(1): >>44414730 #
101. maccard ◴[] No.44399260{4}[source]
The build tool groups files into chunks of roughly equivalent size based on file length, and compiles those groups in parallel.
replies(1): >>44399410 #
102. almostgotcaught ◴[] No.44399410{5}[source]
how many groups do people usually use to get a fast build (alternatively what is the group size)?
replies(1): >>44400955 #
103. computably ◴[] No.44399499{5}[source]
Bad incentives != conspiracy
104. zenlot ◴[] No.44399822{5}[source]
FreePascal to the game please
replies(1): >>44401261 #
105. ◴[] No.44399834{8}[source]
106. infogulch ◴[] No.44399849{3}[source]
Try https://github.com/ncruces/go-sqlite3 it runs sqlite in WASM with wazero, a pure Go WASM runtime, so it builds without any CGo required. Most of the benchmarks are within a few % of the performance of mattn/go-sqlite3.
107. fingerlocks ◴[] No.44399893{5}[source]
You’re absolutely right. I realized this later but it was too late to edit the post.
108. pclmulqdq ◴[] No.44399907{3}[source]
Go defaults to an unoptimized build. If you want it to run heavy optimization passes, you can turn those on with flags. Rust defaults to doing most of those optimizations on every build and allows you to turn them off.
109. choeger ◴[] No.44399968{5}[source]
Separate compilation also enables easy parallelization of type checking.
110. trinix912 ◴[] No.44400236{5}[source]
Had they left "governments" out of there it would've been almost fine, but damn I didn't know it's now governments changing DLLs for us!
replies(1): >>44400720 #
111. taylorallred ◴[] No.44400609[source]
This is true. After making my earlier comment, I went home and tested MSVC and Clang and got similar numbers. I had 1.5s in my head from using it earlier but maybe some changes made it slower. Either way, it's a lot of code and stays on the order of seconds or tens of seconds rather than minutes.
112. benreesman ◴[] No.44400720{6}[source]
https://en.wikipedia.org/wiki/Equation_Group

https://en.wikipedia.org/wiki/Advanced_persistent_threat

https://en.wikipedia.org/wiki/Operation_Olympic_Games

https://simple.wikipedia.org/wiki/Stuxnet

https://en.wikipedia.org/wiki/Cozy_Bear

https://en.wikipedia.org/wiki/Fancy_Bear

https://en.wikipedia.org/wiki/Tailored_Access_Operations

113. duped ◴[] No.44400814{6}[source]
Seems like you still misunderstand both the comment and context and getting overly emotional/conspiratorial. You might want to work on those feelings.
replies(1): >>44401034 #
114. maccard ◴[] No.44400955{6}[source]
It's about 300KB before preprocessor expansion by default. I've never changed it.
115. benreesman ◴[] No.44401034{7}[source]
No one is trying to take anyone's multi-gigabyte pile of dynamic library closure to deploy what should be a few hundred kilobytes of arbitrarily portable, secure by construction, built to last executable.

But people should make an informed choice, and there isn't any noble or high minded or well-meaning reason to try to shout that information down.

Don't confidently assert falsehoods unless you're prepared to have them refuted. You're entitled to peddle memes and I'm entitled to reply with corrections.

116. tukantje ◴[] No.44401261{6}[source]
Will the real FORTRAN please stand up?
117. barchar ◴[] No.44401496[source]
Rust does do this. The unit of compilation is the whole crate and the compiler creates appropriately sized chunks of LLVM IR to balance duplicate work and incrementality.

Rust is generally faster to compile on a per-source-line basis than C++. But Rust projects compile all their dependencies as well.
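Those chunking and optimization trade-offs are exposed directly in Cargo profiles; a sketch (the values shown are the documented defaults):

```toml
# Cargo.toml: knobs trading compile time against runtime performance.
[profile.release]
codegen-units = 16   # more LLVM modules = more parallel codegen, less inlining
lto = false          # "thin" or true improves runtime at real link-time cost

[profile.dev]
codegen-units = 256  # dev heavily favors compile speed
opt-level = 0
```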

118. teleforce ◴[] No.44401934[source]
>When you have things like macros, advanced type systems, and want robustness guarantees at build time.. then you have to pay for that.

The Go and Dlang compilers were designed by people who are really good at compiler design, and that's why they're freaking fast. They designed the languages around the compilers' constraints and at the same time managed to make the languages intuitive to use. For example, Dlang has no macros and no unnecessary symbol look-up for the ambiguous >>.

Because of these design decisions, both Go and Dlang are anomalies for fast compilation. Dlang in particular is notably more powerful and expressive than C++ and Rust, even with its unique hybrid GC and non-GC compilation.

In the automotive industry, it's considered a breakthrough and game-changing achievement to have a fast transmission with seamless automatic and manual shifting, such as the one found in the latest Koenigsegg hypercar [1]. In the programming industry, however, nobody seems to care. Walter Bright, the designer of Dlang, has a background in mechanical engineering, and it shows.

[1] Engage Shift System: Koenigsegg new hybrid manual and automatic gearbox in CC850:

https://www.topgear.com/car-news/supercars/heres-how-koenigs...

replies(1): >>44402967 #
119. rstuart4133 ◴[] No.44401996{5}[source]
All true, but given the number of "Rust compile time is slow" posts that blame the compiler, I think it's safe to say most programmers don't understand the real underlying trade-off that causes it.

Not all programmers, of course: if you look at std there are many places that split types into generic and non-generic parts so the compiler will reuse as much code as possible, but it does come at the cost of additional complexity. Worse, if you aren't already aware of why they are doing it, the language does a marvellous job of hiding the reason that complexity is there. I'd wager a novice Rust programmer is as befuddled by it as a JavaScript programmer coming across his first free() call in C.

I have this dream of a language like Rust that makes the trade-off plain, so the programmer is always aware of "this is a zero-cost abstraction: you're just making it plain via the type system that you're doing the right thing" versus "I'm going to have to generate a lot of code for this". Then go a step further and put the types and source you want to export to other libraries in a special ELF section in the .so, so you don't need the source to link against it. Then go another step further and make the programmer using the .so explicitly instantiate anything that does require a lot of generated code, so he is aware of what is happening.

That said, I don't think it would help the compile time problem in most cases. C++ already does something close by forcing you to put exported stuff in .h files, and they ended up with huge .h files and slow compiles anyway.

Nevertheless, doing that would make for a Rust-like language that, unlike Rust, supported an ecosystem of precompiled libraries, just like C does. Rust is so wedded to transparent monomorphisation that it looks near impossible now.
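The std pattern mentioned above looks roughly like this (the function name here is made up; std::fs::read is a real example of the shape):

```rust
use std::path::Path;

// A thin generic shim converts its argument, then delegates to a single
// non-generic body, so the bulk of the code is monomorphized exactly once
// no matter how many argument types callers use.
fn describe<P: AsRef<Path>>(path: P) -> String {
    fn inner(path: &Path) -> String {
        // All the "real" work lives here, compiled a single time.
        format!("path: {}", path.display())
    }
    inner(path.as_ref())
}

fn main() {
    // Two instantiations of the shim (&str and String), one copy of `inner`.
    assert_eq!(describe("a.txt"), describe(String::from("a.txt")));
}
```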

120. haiku2077 ◴[] No.44402614{5}[source]
As of about five years ago gccgo was slower for almost all workloads: https://meltware.com/2019/01/16/gccgo-benchmarks-2019.html
121. bjoli ◴[] No.44402705[source]
Chez scheme handles macros that expand to macros that expand to tens of thousands of lines of code in a breeze. Macros don't have to be slow.
122. pjmlp ◴[] No.44402941{5}[source]
And D, Active Oberon, Eiffel,....

Go got famous compile times because, for a decade, a new generation educated in scripting languages and used to badly configured C and C++ projects took for innovation what was actually a return to old values in compiler development.

123. pjmlp ◴[] No.44402957{5}[source]
That is something I never have to care about on my C++ projects, because I always make use of binary libraries, unless I am forced to compile from source.

Unfortunately that doesn't seem to ever be a scenario cargo will support out of the box.

replies(1): >>44409017 #
124. pjmlp ◴[] No.44402967{3}[source]
It isn't an anomaly, it was pretty standard during the 1990's, until C and C++ took over all other compiled languages, followed by a whole generation educated in scripting languages.
125. pjmlp ◴[] No.44402975{3}[source]
C++23 modules compile really fast now, more so when helped by incremental compilation, incremental linking, and a culture that has no qualms about depending on binary libraries.
126. pjmlp ◴[] No.44402983{4}[source]
C++23 modules make point 2 moot.

Same applies to point 3, when used alongside external templates.

127. pjmlp ◴[] No.44402984{4}[source]
Package the most-used scenarios as external templates in a binary library.
128. pjmlp ◴[] No.44402994{3}[source]
The newbie C++ developer, unaware of pre-compiled headers, binary libraries, and external templates.

Most likely treating C and C++ as scripting languages with header only libraries, only to complain about build times afterwards.

129. 1718627440 ◴[] No.44403903{4}[source]
Reading your comment, it sounds like the opposite would be true, because so much linking would need to be done at runtime. But that perception fails to recognize that when an executable is said to be dynamically linked, most symbols were still statically linked. It is only the few publicly exported symbols that are dynamically linked, because they are deemed to be a reasonably separate concern that should be handled by someone else's codebase.

I think lazy linking is the default even if you don't use dlopen, i.e. every symbol gets linked upon first use. Of course that has the drawback that the program can crash due to missing/incompatible libraries in the middle of its work.

replies(1): >>44405021 #
130. maccard ◴[] No.44403925{4}[source]
I think the definition of "unity build" is sufficiently unclear that it can be used as a pro or a con. It's one of the problems with C++: there's no default build tool to compare with. I'd argue that a single unity blob with one compiler invocation isn't a fair comparison, and that when you made a token effort to configure the unity build you saw a 2x speed-up.

I’m the tech director for a game studio using unreal and we spec the dev machines with enough memory to avoid that pressure - we require 2.5GB/Core, rounded up. Right now we’re using i9 14900k’s with 128GB RAM. We were on 5950x with 64GB before this.

I tried egui last night but failed to build it due to missing dependencies, but I’ll see if I can get it running on Monday. I’d also love to see what raddbg would be like if we split it into even 4 compilation units

replies(1): >>44405080 #
131. AdelaideSimone ◴[] No.44405021{5}[source]
A lot of vendors use non-lazy binding for security reasons, and some platforms don't support anything other than RTLD_NOW (e.g., Android).

Anyway, while what you said is theoretically half-true, a fairly large number of libraries are not designed/encapsulated well. This means almost all of their symbols are exported dynamically, so the idea that there are only a "few public exported symbols" is unfortunately false.

However, something almost no one ever mentions is that ELF was actually designed to allow dynamic libraries to be fairly performant. It isn't something I would recommend, as it breaks many assumptions on Unices, but (while you don't get the benefits of LTO) you can achieve code generation almost equivalent to static linking by using something like "-fno-semantic-interposition -Wl,-Bsymbolic,-z,now". MaskRay has a good explanation of it: https://maskray.me/blog/2021-05-16-elf-interposition-and-bsy...

132. ben-schaaf ◴[] No.44405080{5}[source]
> I'd argue that a single unity blob with one compiler invocation isn't a fair comparison and that when you made a token effort to configure the unity build you saw a 2x speed up.

Most certainly an unfair comparison; I did not know that "unity build" also referred to a hybrid setup. I've only seen people advocate for a single translation unit approach, and that's always just plain slow.

Personally I'll almost always prefer faster incremental compilation; thankfully my work's clean builds only take 30 seconds.

> I’d also love to see what raddbg would be like if we split it into even 4 compilation units

Should be a minimum of 32x faster just building in parallel with our hardware ;)

Unfortunately raddbg suffers from unity-build-syndrome: It has headers but they're missing definitions, and .c files are missing includes.

replies(1): >>44405559 #
133. davikr ◴[] No.44405178[source]
it's fast to build with msvc.
134. maccard ◴[] No.44405559{6}[source]
If my clean builds were 30 seconds I'd prefer incremental performance too. Our precompiled headers take longer than that. A clean UE5 editor build with precompiled headers, adaptive unity builds, and optimisations disabled takes 20 minutes or so. Incremental changes to game projects are usually pretty quick, under 5 seconds or so (which is fast enough when the whole thing takes 30+ seconds to boot anyway).

> Unfortunately raddbg suffers from unity-build-syndrome: It has headers but they're missing definitions, and .c files are missing includes.

We run a CI build once a week with unity disabled to catch these issues.

> Should be a minimum of 32x faster just building in parallel with our hardware ;)

My experience here is that parallelism is less than linear and avoiding repeated work is super-linear - there’s a happy medium. If you had 128 files and 32 cores it’s probably fast to compile in parallel, but likely even faster if you can just dispatch 32 “blobs” and link them at the end.

135. rowanG077 ◴[] No.44406898{3}[source]
You can't disable those features. There is no "no type-checking" mode. Or "no borrow-checking" mode. It's not opt-out.
136. ChadNauseam ◴[] No.44409017{6}[source]
The actual reason you don't have to care about this in your C++ projects is that C++ doesn't let you define macros in C++; you can only define them in the preprocessor language. Therefore no compilation is needed to execute them.
replies(1): >>44410977 #
137. pjmlp ◴[] No.44410977{7}[source]
I never write macros in C++, other than header guards, for years now.

I belong to the school of thought that C style macros in C++ should be nuked.

Exception being source code I don't own.

138. Sinidir ◴[] No.44414730{3}[source]
Pipe to /dev/null. Fastest database I have ever used.
replies(1): >>44420730 #
139. moffkalast ◴[] No.44420730{4}[source]
/dev/null is web scale