392 points mfiguiere | 4 comments
bogwog ◴[] No.35471515[source]
I feel so lucky that I found waf[1] a few years ago. It just... solves everything. Build systems are notoriously difficult to get right, but waf is about as close to perfect as you can get. Even when it doesn't do something you need, or it does things in a way that doesn't work for you, the amount of work needed to extend/modify/optimize it to your project's needs is tiny (minus the learning curve ofc, but the core is <10k lines of Python with zero dependencies), and doesn't require you to maintain a fork or anything like that.

The fact that the Buck team felt they had to do a from scratch rewrite to build the features they needed just goes to show how hard it is to design something robust in this area.

If there are any people in the Buck team here, I would be curious to hear if you all happened to evaluate waf before choosing to build Buck? I know FB's scale makes their needs unique, but at least at a surface level, it doesn't seem like Buck offers anything that couldn't have been implemented easily in waf. Adding Starlark, optimizing performance, implementing remote task execution, adding fancy console output, implementing hermetic builds, supporting any language, etc...

[1]: https://waf.io/

replies(7): >>35471805 #>>35471941 #>>35471946 #>>35473733 #>>35474259 #>>35476904 #>>35477210 #
klodolph ◴[] No.35474259[source]
> If there are any people in the Buck team here, I would be curious to hear if you all happened to evaluate waf before choosing to build Buck?

There’s no way Waf can handle code bases as large as the ones inside Facebook (Buck) or Google (Bazel). Waf also has some problems with cross-compilation, IIRC. Waf would simply choke.

If you think about the problems you run into with extremely large code bases, then the design decisions behind Buck/Bazel/etc. start to make a lot of sense. Things like how targets are labeled as //package:target, rather than paths like package/target. Package build files are only loaded as needed, so your build files can be extremely broken in one part of the tree, and you can still build anything that doesn’t depend on the broken parts. In large code bases, it is simply not feasible to expect all of your build scripts to work all of the time.
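To make the lazy-loading point concrete, here is a minimal Python sketch. It is not how Buck or Bazel are implemented; `parse_label`, `transitive_deps`, and the `PACKAGES` table are hypothetical names invented for illustration. It only shows the idea that a `//package:target` label names the package to load, so a broken build file elsewhere in the tree is never evaluated.

```python
from typing import Callable, Dict, List, Set, Tuple

# Hypothetical model: package name -> loader for that package's build file.
# A loader returns {target_name: [dependency labels]} or raises if the file is broken.
Loader = Callable[[], Dict[str, List[str]]]


def parse_label(label: str) -> Tuple[str, str]:
    """Split '//package:target' into (package, target)."""
    if not label.startswith("//") or ":" not in label:
        raise ValueError(f"not a //package:target label: {label!r}")
    package, target = label[2:].split(":", 1)
    return package, target


def transitive_deps(root: str, packages: Dict[str, Loader]) -> Set[str]:
    """Return every label reachable from root, loading packages only on demand."""
    loaded: Dict[str, Dict[str, List[str]]] = {}
    seen: Set[str] = set()
    stack = [root]
    while stack:
        label = stack.pop()
        if label in seen:
            continue
        seen.add(label)
        package, target = parse_label(label)
        if package not in loaded:
            # The build file is evaluated here, once, and only because it is needed.
            loaded[package] = packages[package]()
        stack.extend(loaded[package][target])
    return seen


def broken() -> Dict[str, List[str]]:
    raise SyntaxError("this build file does not evaluate")


PACKAGES: Dict[str, Loader] = {
    "app": lambda: {"main": ["//lib:core"]},
    "lib": lambda: {"core": []},
    "experimental": broken,  # never loaded unless something depends on it
}

print(sorted(transitive_deps("//app:main", PACKAGES)))
# prints ['//app:main', '//lib:core']
```

Because the label carries the package name explicitly, the resolver knows which build file to load without scanning the tree; asking for `//experimental:anything` would fail, but nothing else does.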

The Python -> Starlark change was made because the build scripts need to be completely hermetic and deterministic. Starlark is reusable outside Bazel/Buck precisely because other projects want that same hermeticity and determinism.

Waf is nice, but I really want to emphasize just how damn large the codebases are that Bazel and Buck handle. They are large enough that you cannot load the entire build graph into memory on a single machine; neither Facebook nor Google has the will to load that much RAM into a single server just to run builds or build queries. Some of these design decisions are basically there so that you can load subsets of the build graph and cache parts of the build graph. You want to hit cache as much as possible.

I’ve used Waf and its predecessor SCons, and I’ve also used Buck and Bazel.

replies(3): >>35475404 #>>35475425 #>>35476956 #
bogwog ◴[] No.35475425[source]
I get that, but again, there's no reason Waf can't be used as a base for building that. I actually use Waf for cross compilation extensively, and have built some tools around it with Conan for my own projects. Waf can handle cross compilation just fine, but it's up to you to build what that looks like for your project (a common pattern I see is custom Context subclasses for each target)

Memory management, broken build scripts, etc. can all be handled with Waf as well. In the simplest case, you can just wrap a `recurse` call in a try/except block, or you can build something much more sophisticated around how your projects are structured.
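A rough sketch of that simplest case, in plain Python. `SketchContext` is a hypothetical stand-in written for this example, not Waf's real `Context` class, and its `recurse` only mimics the idea of running one directory's build script; treat it as an illustration of the pattern rather than working wscript code.

```python
from typing import Callable, Dict, List


class SketchContext:
    """Hypothetical stand-in for a build context; NOT Waf's real Context class."""

    def __init__(self, scripts: Dict[str, Callable[["SketchContext"], None]]):
        self.scripts = scripts               # directory -> build script function
        self.targets: List[str] = []         # targets registered so far
        self.skipped: Dict[str, str] = {}    # directory -> error description

    def recurse(self, directory: str) -> None:
        """Run the build script for one directory; errors propagate to the caller."""
        self.scripts[directory](self)

    def recurse_tolerant(self, directories: List[str]) -> None:
        """Recurse into each directory, recording failures instead of aborting."""
        for directory in directories:
            before = len(self.targets)
            try:
                self.recurse(directory)
            except Exception as exc:  # a build script can fail in arbitrary ways
                del self.targets[before:]  # drop targets from the half-run script
                self.skipped[directory] = f"{type(exc).__name__}: {exc}"


def good_script(ctx: SketchContext) -> None:
    ctx.targets.append("lib/core")


def broken_script(ctx: SketchContext) -> None:
    ctx.targets.append("tools/half")     # registered before the script blows up
    raise NameError("undefined_helper")


ctx = SketchContext({"lib": good_script, "tools": broken_script})
ctx.recurse_tolerant(["lib", "tools"])
print(ctx.targets)   # prints ['lib/core']
print(ctx.skipped)   # prints {'tools': 'NameError: undefined_helper'}
```

Note that this still evaluates every script up front; it tolerates broken ones but does not avoid loading them, which is a different trade-off from loading build files only on demand.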

Note, I'm not trying to argue that Google/Facebook "should have used X". There are a million reasons to pick X over Y, even if Y is the objectively better choice. Sometimes, molding X to be good enough is more efficient than spending months just researching options hoping you'll find Y.

I'm just curious to know, if they did evaluate Waf, why they decided against it.

replies(2): >>35476812 #>>35476905 #
davnn ◴[] No.35476812[source]
> the core is <10k lines of Python with zero dependencies

Isn’t that already a no-go, to write a performance-critical system in a slow programming language?

replies(1): >>35476929 #
taeric ◴[] No.35476929[source]
I am no Python fan, but I find it laughably hard to believe it could be what makes a build coordination system slow.
replies(1): >>35478403 #
1. Too ◴[] No.35478403[source]
On clean builds the Python tax will be dwarfed by the thousands of calls to clang, yes. That’s not the scenario you need to optimize for. What’s more important is that incremental builds are snappy, since that is what developers do 100 times per day.

I’ve seen some projects with 100MB+ ninja files that even ninja itself, proud of being written in optimized C++, takes a second or two to parse on each build invocation. Convert that to Python and you likely land in the 5-20 sec range instead. Enough to alt-tab and get distracted by something else. Google’s code base is likely even larger than this.

A background daemon that holds the graph in memory would probably handle it. In the grand scheme of things, such a design is likely better anyway. But it needs a big upfront design and is a lot more complex than just reparsing a file each time.
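The core of that idea can be sketched in a few lines of Python, leaving out everything that makes a real daemon hard (IPC, file watching, concurrency). `GraphCache`, `parse_graph`, and the toy `target: dep dep` file format are assumptions made up for this example; the point is only that the parsed graph stays in memory and is reparsed when the file's modification time or size changes.

```python
import os
import tempfile
from typing import Dict, List, Tuple

Graph = Dict[str, List[str]]


def parse_graph(path: str) -> Graph:
    """Parse a toy 'target: dep dep ...' file (stand-in for an expensive parse)."""
    graph: Graph = {}
    with open(path, encoding="utf-8") as handle:
        for line in handle:
            line = line.strip()
            if not line:
                continue
            target, _, deps = line.partition(":")
            graph[target.strip()] = deps.split()
    return graph


class GraphCache:
    """Keeps the parsed graph in memory; reparses only when the file changes."""

    def __init__(self, path: str):
        self.path = path
        self.parse_count = 0                    # how many real parses happened
        self._stamp: Tuple[int, int] = (-1, -1)  # (mtime_ns, size) of last parse
        self._graph: Graph = {}

    def get(self) -> Graph:
        info = os.stat(self.path)
        stamp = (info.st_mtime_ns, info.st_size)
        if stamp != self._stamp:
            self._graph = parse_graph(self.path)
            self._stamp = stamp
            self.parse_count += 1
        return self._graph


with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "build.graph")
    with open(path, "w", encoding="utf-8") as handle:
        handle.write("app: lib\nlib:\n")

    cache = GraphCache(path)
    cache.get()
    cache.get()                       # second call is served from memory
    print(cache.parse_count)          # prints 1

    with open(path, "w", encoding="utf-8") as handle:
        handle.write("app: lib util\nlib:\nutil:\n")
    print(cache.get()["app"])         # prints ['lib', 'util']
    print(cache.parse_count)          # prints 2
```

A long-lived process would hold one `GraphCache` across build invocations, which is where the saving comes from; a fresh process per build would pay the parse every time regardless.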

Side note: For some, even the interpreter startup is annoying. Personally I find it negligible; especially since 3.11, you can almost claim it’s snappy.

replies(1): >>35481376 #
2. taeric ◴[] No.35481376[source]
Code bases that big are strawmen for most companies. Yes, they happen, but just as often they should be segmented into smaller things that don't require monolithic build setups.
replies(1): >>35484881 #
3. joshuamorton ◴[] No.35484881[source]
The context for this thread was whether Facebook considered waf in particular, so it is very relevant.
replies(1): >>35486237 #
4. taeric ◴[] No.35486237[source]
Certainly fair. I had meandered on to "in general" way too quickly.