
Tree Borrows

(plf.inf.ethz.ch)
564 points zdw | 1 comments
jcalvinowens ◴[] No.44513250[source]
> On the one hand, compilers would like to exploit the strong guarantees of the type system—particularly those pertaining to aliasing of pointers—in order to unlock powerful intraprocedural optimizations.

How true is this really?

Torvalds has argued for a long time that strict aliasing rules in C are more trouble than they're worth, and I find his arguments compelling. Here's one of many examples: https://lore.kernel.org/all/CAHk-=wgq1DvgNVoodk7JKc6BuU1m9Un... (the entire thread is worth reading if you find this sort of thing interesting)

Is Rust somehow fundamentally different? Based on limited experience, it seems not (at least, when unsafe is involved...).

replies(11): >>44513333 #>>44513357 #>>44513452 #>>44513468 #>>44513936 #>>44514234 #>>44514867 #>>44514904 #>>44516742 #>>44516860 #>>44517860 #
Validark ◴[] No.44517860[source]
Personally, I would like compilers to better exploit vectorization, which can make random things within typical workloads 2x to 10x faster, rather than worry about dubious optimizations whose measured improvements may or may not just be caused by changing the alignment of code and data blocks.

I would like to see more effort dedicated to basic one-liners that show up in real code, like counting how many times a given character occurs in a string. E.g. `for (str) |e| count += e == '%'`. For this, LLVM spits out a loop that wants to do a horizontal addition on every iteration, at least on x86-64 targets with vectors. Let's focus on issues that can easily net a 2x performance gain before going after the 1-2% that people think pointer aliasing gets you.
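For anyone who wants to poke at this in a compiler explorer, here is a rough Rust rendering of that one-liner. It is only a sketch: the function names are made up for illustration, and it makes no claim about what code any particular compiler version emits for either form.

```rust
/// Count how many bytes in `haystack` equal `needle`, written as an
/// explicit loop that mirrors `count += e == '%'`.
/// (Hypothetical name, for illustration only.)
pub fn count_byte_loop(haystack: &[u8], needle: u8) -> usize {
    let mut count = 0usize;
    for &b in haystack {
        // A bool converts to 0 or 1, so this adds 1 on every match.
        count += (b == needle) as usize;
    }
    count
}

/// The same computation using iterator adapters.
/// (Hypothetical name, for illustration only.)
pub fn count_byte_iter(haystack: &[u8], needle: u8) -> usize {
    haystack.iter().filter(|&&b| b == needle).count()
}
```

Calling `count_byte_loop(b"100%% of 50%", b'%')` counts the three percent signs; comparing the generated assembly of the two forms across targets is left to the reader.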

replies(2): >>44518139 #>>44518211 #
1. imtringued ◴[] No.44518211[source]
Pointer aliasing information is necessary for auto-vectorization: you can't safely perform SIMD if the data you're modifying might overlap with the data you're reading. Since the compiler is only allowed to transform code in ways that are legal for all inputs, it will be conservative and refuse to vectorize your code rather than break it in situations where the pointers alias.
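A small Rust sketch of why overlap matters, with made-up function names. In the safe version the exclusive borrow rules out overlap. In the raw-pointer version overlap is allowed, and the demo shows an input where each iteration reads a value the previous iteration just wrote, so loading several elements up front would change the result.

```rust
/// Safe version: `dst` is an exclusive borrow, so it cannot overlap `src`.
pub fn add_into(dst: &mut [f32], src: &[f32]) {
    for (d, s) in dst.iter_mut().zip(src) {
        *d += *s;
    }
}

/// Raw-pointer version: nothing in the signature rules out overlap.
///
/// # Safety
/// Both pointers must be valid for `len` elements. They may overlap.
pub unsafe fn add_into_raw(dst: *mut f32, src: *const f32, len: usize) {
    for i in 0..len {
        // Element-by-element, in order: this ordering is what the
        // compiler has to preserve when the ranges might overlap.
        unsafe { *dst.add(i) += *src.add(i) };
    }
}

/// Hypothetical demo: `dst` starts one element after `src` in the same buffer.
pub fn overlap_demo() -> [f32; 4] {
    let mut buf = [1.0f32, 2.0, 3.0, 4.0];
    let p = buf.as_mut_ptr();
    unsafe { add_into_raw(p.add(1), p as *const f32, 3) };
    buf // [1.0, 3.0, 6.0, 10.0]
}
```

If the three source elements were loaded in one go before any store, the overlapping case would produce `[1.0, 3.0, 5.0, 7.0]` instead, which is exactly the kind of behavior change the compiler is not allowed to introduce.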

Maybe that was too convoluted a way of putting it. Put differently:

Loading something from main memory into a register creates a locally cached copy. This mini cache needs to be invalidated whenever a pointer can potentially write to the location in main memory that this copy is caching. In other words, you need cache synchronization down to the register level, including microarchitectural registers that are implementation details of the processor in question. Rather than do that, if you can prove that you are the exclusive owner of a region in memory, you know that your copy is the most up-to-date version and you know exactly when it gets updated. This means you are free to copy the main memory into your vector register and do anything you want, including scalar pointer writes to main memory, since you know they are unrelated and will not invalidate your vector registers.
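The "register as a cache" point can be sketched in Rust too. The names here are hypothetical. With two `&mut` references the store through `other` cannot touch `*counter`, so a compiler is free to keep the counter in a register across it. With raw pointers the two may be the same address, and the value has to be re-read after the store.

```rust
/// Exclusive borrows: `counter` and `other` cannot refer to the same memory,
/// so the store to `*other` cannot invalidate a register copy of `*counter`.
pub fn bump_twice(counter: &mut u32, other: &mut u32) -> u32 {
    *counter += 1;
    *other = 0;
    *counter += 1;
    *counter
}

/// Raw pointers: `counter` and `other` are allowed to be equal.
///
/// # Safety
/// Both pointers must be valid for reads and writes of a `u32`.
pub unsafe fn bump_twice_raw(counter: *mut u32, other: *mut u32) -> u32 {
    unsafe {
        *counter += 1;
        *other = 0; // may overwrite `*counter` if the pointers are equal
        *counter += 1;
        *counter
    }
}

/// Hypothetical demo: pass the same address for both arguments.
pub fn aliased_demo() -> u32 {
    let mut c = 5u32;
    let p = &mut c as *mut u32;
    unsafe { bump_twice_raw(p, p) } // 1, not 7
}
```

With distinct locations starting at 5, both versions return 7. With the same location passed twice, the raw version returns 1, so a register copy of the counter that survived the middle store would give the wrong answer.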