
Tree Borrows

(plf.inf.ethz.ch)
564 points by zdw | 2 comments
jcalvinowens ◴[] No.44513250[source]
> On the one hand, compilers would like to exploit the strong guarantees of the type system—particularly those pertaining to aliasing of pointers—in order to unlock powerful intraprocedural optimizations.

How true is this really?

Torvalds has argued for a long time that strict aliasing rules in C are more trouble than they're worth, I find his arguments compelling. Here's one of many examples: https://lore.kernel.org/all/CAHk-=wgq1DvgNVoodk7JKc6BuU1m9Un... (the entire thread worth reading if you find this sort of thing interesting)
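The classic shape of the problem can be sketched in a few lines of C (the function name is illustrative, not from the linked thread):

```c
#include <assert.h>

/* Illustrative sketch: under strict aliasing (the default in C), the
 * compiler may assume an int* and a float* never point at the same
 * object, so it can fold the final read of *i to the constant 1
 * without worrying that the store through f changed it. */
int set_and_read(int *i, float *f) {
    *i = 1;       /* store through int* */
    *f = 2.0f;    /* store through float*; assumed disjoint from *i */
    return *i;    /* may be constant-folded to 1 */
}
```

Torvalds' complaint is that low-level code routinely does pass aliasing pointers of different types (via casts, unions, etc.), and then this folding silently changes behavior; `-fno-strict-aliasing` turns the assumption off.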

Is Rust somehow fundamentally different? Based on limited experience, it seems not (at least, when unsafe is involved...).

replies(11): >>44513333 #>>44513357 #>>44513452 #>>44513468 #>>44513936 #>>44514234 #>>44514867 #>>44514904 #>>44516742 #>>44516860 #>>44517860 #
jcranmer ◴[] No.44514867[source]
Take anything Linus says about compilers with a grain of salt--he writes OS kernels, not compilers, and those are pretty different domains.

Alias analysis is extremely important for getting good performance these days--but it should also be remembered that the biggest benefits accrue from the simplest heuristics (like two loads that use the same SSA value as the pointer must alias each other). In LLVM terms, that's BasicAA: a collection of very simple heuristics that primarily amounts to "if we can track down the allocation sites of objects, we can definitively resolve these alias queries; otherwise, we don't know."
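A sketch of the trivial case described above (function name is mine, for illustration):

```c
#include <stdlib.h>

/* Illustrative sketch: p and q come from two distinct allocation
 * sites, so even the simplest alias analysis can prove the store to
 * *q cannot modify *p, and the returned load can be forwarded
 * directly from the earlier store of 1. */
int distinct_allocs(void) {
    int *p = malloc(sizeof *p);
    int *q = malloc(sizeof *q);
    if (!p || !q) { free(p); free(q); return -1; }
    *p = 1;
    *q = 2;
    int r = *p;   /* provably 1: allocation-site tracking suffices */
    free(p);
    free(q);
    return r;
}
```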

The real question that you're trying to ask, though, is what is the value of alias analyses that go beyond the most basic, obvious tests. At the point where the alias queries are no longer trivial to solve, then it's generally the case that what you can do as a result of those queries also shrinks dramatically, pretty much to looking for code motion hazards, and the benefits you get from that are much reduced. One of the experiments I would like to do is measure the total speedup you'd get from a theoretically perfect alias analysis, and my guess is that it's somewhere in the 20% range even on non-HPC code like the Linux kernel [1].
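A minimal sketch of the kind of code-motion hazard meant here (names are illustrative):

```c
/* Illustrative sketch: unless alias analysis can prove dst and n
 * point at disjoint memory, the load of *n must stay inside the
 * loop, because the store to dst[i] might change it. A stronger
 * no-alias result would let the load be hoisted out of the loop. */
void zero_prefix(int *dst, const int *n) {
    for (int i = 0; i < *n; i++)   /* *n reloaded each iteration */
        dst[i] = 0;
}
```

Note that the `const` on `n` doesn't help: it only forbids writes through `n` itself, not writes to the same object through `dst`.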

[1] This doesn't account for the heroic optimizations, such as data-layout transformations, that you wouldn't attempt to write without a very high-quality alias analysis. But since we already know that a near-perfect alias analysis doesn't exist in practice, we're not going to attempt those optimizations anyways, so it's not worth including such stuff in prospective speed gains.

replies(3): >>44515164 #>>44517072 #>>44517943 #
1. jcalvinowens ◴[] No.44517072[source]
> Take anything Linus says about compilers with a grain of salt

I think he's making an argument about CPU behavior here more than about compilers: if we call the loads and stores that aliasing optimizations might remove "redundant", he's saying that modern machines with big caches and store buffers make those "redundant" operations so cheap that they don't matter in practice for most workloads.

Of course, it's admittedly an oversimplification to reduce aliasing optimizations to simply eliminating loads and stores; you described optimizations that go beyond that.
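The load-elimination case can be sketched like this (function name is illustrative): `restrict` is the programmer asserting no aliasing, which lets the compiler reuse an earlier load.

```c
/* Illustrative sketch: with restrict, the compiler may assume dst and
 * src don't alias, so the second read of *src can reuse the value
 * already loaded, i.e. the "redundant" load is eliminated. The point
 * above is that on a modern CPU that reload would likely hit the
 * store buffer or L1 cache anyway, so eliminating it buys little. */
int copy_and_return(int * restrict dst, const int * restrict src) {
    *dst = *src;
    return *src;   /* may reuse the earlier load instead of reloading */
}
```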

However, I just ran a silly little test: I added `-fno-strict-aliasing` to CFLAGS and rebuilt the world on one of my Gentoo build machines, and it only got about 1.5% slower at compiling an allmodconfig Linux kernel (gcc-14):

    Before: 14m39.101s 14m42.462s 14m41.497s 14m44.540s
    After:  14m54.354s 14m54.415s 14m55.580s 14m55.793s
That's on a shiny new znver5.
replies(1): >>44519944 #
2. sapiogram ◴[] No.44519944[source]
Very cool benchmark, thanks!