Tree Borrows

(plf.inf.ethz.ch)

573 points zdw | 1 comments | 09 Jul 25 14:40 UTC


jcalvinowens ◴[09 Jul 25 18:21 UTC] No.44513250[source]▶

> On the one hand, compilers would like to exploit the strong guarantees of the type system—particularly those pertaining to aliasing of pointers—in order to unlock powerful intraprocedural optimizations.

How true is this really?

Torvalds has argued for a long time that strict aliasing rules in C are more trouble than they're worth, and I find his arguments compelling. Here's one of many examples: https://lore.kernel.org/all/CAHk-=wgq1DvgNVoodk7JKc6BuU1m9Un... (the entire thread is worth reading if you find this sort of thing interesting)

Is Rust somehow fundamentally different? Based on limited experience, it seems not (at least, when unsafe is involved...).

replies(11): >>44513333 #>>44513357 #>>44513452 #>>44513468 #>>44513936 #>>44514234 #>>44514867 #>>44514904 #>>44516742 #>>44516860 #>>44517860 #

jandrewrogers ◴[09 Jul 25 19:31 UTC] No.44513936[source]▶

>>44513250 #

Strict aliasing rules are useful conditional on them being sufficiently expressive and sensible, otherwise they just create pointless headaches that require kludgy workarounds or they are just disabled altogether. I don't think there is much disagreement that C strict aliasing rules are pretty broken. There is no reason a language like Rust can't be designed with much more sensible strict aliasing rules. Even C++ has invested in providing paths to more flexibility around strict aliasing than C provides.

But like Linus, I've noticed it doesn't seem to make much difference outside of obvious narrow cases.

replies(1): >>44517397 #

gronpi ◴[10 Jul 25 05:02 UTC] No.44517397[source]▶

>>44513936 #

>There is no reason a language like Rust can't be designed with much more sensible strict aliasing rules.

The aliasing rules of Rust for mutable references are different and more difficult than strict aliasing in C and C++.

Strict aliasing in C and C++ is also called TBAA (type-based alias analysis), since it is based on compatible types. If types are compatible, pointers can alias. This is different in Rust, where mutable references absolutely may never alias, not even if the types are the same.

Rust aliasing is more similar to C _restrict_.
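A minimal Rust sketch of that point, with a made-up function name (`add_twice`) chosen only for illustration. Both parameters refer to the same type, yet the `&mut` parameter is still guaranteed not to overlap the shared one, much like a `restrict`-qualified pointer in C:

```rust
/// `dst` is a mutable reference, so it cannot alias `src`, even though both
/// point to `i32`. The compiler is therefore allowed to load `*src` only once.
fn add_twice(dst: &mut i32, src: &i32) {
    *dst += *src;
    *dst += *src;
}

/// Hypothetical helper showing a valid call with two distinct variables.
fn add_twice_demo() -> i32 {
    let mut a = 1;
    let b = 2;
    add_twice(&mut a, &b);
    // add_twice(&mut a, &a); // rejected by the borrow checker: `a` is
    //                        // already mutably borrowed
    a
}
```

In safe code the aliasing call does not compile at all, so the rule only needs attention once unsafe code is involved.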

The Linux kernel goes in the other direction and has strict aliasing optimization disabled.

replies(1): >>44518204 #

ralfj ◴[10 Jul 25 07:12 UTC] No.44518204[source]▶

>>44517397 #

> The aliasing rules of Rust for mutable references are different and more difficult than strict aliasing in C and C++.

"more difficult" is a subjective statement. Can you substantiate that claim?

I think there are indications that it is less difficult to write aliasing-correct Rust code than C code. Many major C codebases entirely give up on even trying, and just set "-fno-strict-aliasing" instead. It is correct that some types are compatible, but in practice that doesn't actually help very much since very few types are compatible -- a lot of patterns people would like to write now need extra copies via memcpy to avoid strict aliasing violations, costing performance.

In contrast, Rust provides raw pointers that always let you opt-out of aliasing requirements (if you use them consistently); you will never have to add extra copies. Miri also provides evidence that getting aliasing right is not harder than dealing with other forms of UB such as data races and uninitialized memory (with Tree Borrows, those all occur about the same amount).
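A small sketch of what opting out can look like, assuming raw pointers are used consistently. The names `add_twice_raw` and `double_twice` are hypothetical. The two raw pointer parameters are allowed to point to the same `i32`, and no copy is needed:

```rust
/// # Safety
/// `dst` and `src` must be valid, aligned, and point to initialized `i32`s.
/// Unlike references, they MAY point to the same location.
unsafe fn add_twice_raw(dst: *mut i32, src: *const i32) {
    unsafe {
        // `*src` has to be re-read each time, because the write through
        // `dst` may have changed it.
        *dst += *src;
        *dst += *src;
    }
}

/// Derives one raw pointer from the reference and uses only raw pointers
/// from then on, passing the same location as both arguments.
fn double_twice(x: &mut i32) {
    let p: *mut i32 = x;
    unsafe { add_twice_raw(p, p as *const i32) };
}
```

Calling `double_twice` on a variable holding 1 leaves it at 4: the second addition sees the result of the first, which is exactly the behavior an aliasing assumption would have ruled out.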

I would love to see someone write a strict aliasing sanitizer and run it on popular C codebases. I would expect a large fraction to be found to have UB.

replies(1): >>44518522 #

hamcocar ◴[10 Jul 25 08:05 UTC] No.44518522[source]▶

>>44518204 #

I am very sorry, but your arguments here are terrible.

Unless casting or type punning through unions is used, the type system should help a lot in terms of avoiding using pointers to types that are not compatible in C. And then special care can be taken in any cases where casts are used. C++ is probably better at avoiding type casts, with all the abstractions it has.

This is different from no aliasing of Rust, where mutable references of even the same type may not alias.

Your own tool, Miri, reports that this fairly simple code snippet is UB, even though it is only the raw pointer that is dereferenced, and "a2" is not even read after assignment.

https://play.rust-lang.org/?version=stable&mode=debug&editio...

And you know better than me that Miri cannot handle everything. And Miri is slow to run, which is normal for that kind of advanced tool, not a demerit against Miri but against the general kind of tool it is.

I am very surprised that you come with arguments this poor.

replies(3): >>44518942 #>>44518975 #>>44519103 #

1. jojomodding ◴[10 Jul 25 09:10 UTC] No.44518942[source]▶

>>44518522 #

> Unless casting or type punning through unions are used, the type system should help a lot in terms of avoiding using pointers to types that are not compatible in C

This is simply wrong. For one, C's type system does not help you here at all. Consider the following code:

  float* f = ...;
  void* v = f;
  long* i = v;
  // Code using both *i and *f

This code has undefined behavior due to TBAA. Evidently, no unions are used in it. The type system also inserts implicit casts which can be hard to spot. This issue is not theoretical: the snippet above is taken from Quake 3's <https://en.wikipedia.org/wiki/Fast_inverse_square_root>.
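For comparison, a minimal Rust sketch of the same bit trick that avoids pointer punning altogether. `f32::to_bits` and `f32::from_bits` are standard library functions that reinterpret the value rather than a pointer, so no aliasing rule is involved. The function name `fast_inv_sqrt` is made up for illustration, the constant is the one given in the linked Wikipedia article, and this is not the Quake 3 code itself:

```rust
/// Approximates 1 / sqrt(x) for positive, normal `x`.
fn fast_inv_sqrt(x: f32) -> f32 {
    // Reinterpret the float's bits as an integer by value (no pointer cast).
    let i: u32 = x.to_bits();
    // Initial guess from the well-known magic constant.
    let i = 0x5f37_59df_u32.wrapping_sub(i >> 1);
    let y = f32::from_bits(i);
    // One Newton-Raphson step to refine the guess.
    y * (1.5 - 0.5 * x * y * y)
}
```

For an input of 4.0 the result lands within about one percent of the exact value 0.5.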

Further, you just can't seriously argue that C's type system helps you avoid UB in a thread about Rust. Rust's type system is the one that helps you avoid UB, and it's just so much better at that.

The code you mention has UB, yes, but for reasons entirely unrelated to aliasing. You're reading from the address literal 0x0000000b, which is unsurprisingly not a live allocation. It's equivalent to the following C code (which similarly has UB).

  printf("%d", *(int*)(0x0000000b));

The first rule of writing safe code in Rust is "don't use unsafe." This rule is iron-clad (up to compiler bugs). You broke that rule. The second rule of writing safe code in Rust is "if you use unsafe, know what you're doing." You also broke that rule since the Rust code you wrote is probably not what you wanted it to be.

But the implications of the second rule are also that you should know the aliasing model, or at least the over-approximation of "do not mix references and pointers." If you use raw pointers everywhere, you won't run into aliasing bugs.

> This is different from no aliasing of Rust, where mutable references of even the same type may not alias.

Aliasing in Rust is simpler when you follow the first rule, since everything is checked by the compiler. And if you use unsafe code with raw pointers, things are still simpler than in C since there is no TBAA. Only if you mix references and pointers do you get into territory where you need to know the aliasing model.
