
92 points endorphine | 2 comments
Brian_K_White ◴[] No.43536927[source]
They let the programmer be the ultimate definer of correctness.

They don't prioritize performance over correctness, they prioritize programmer control over compiler/runtime control.

replies(2): >>43536968 #>>43537845 #
hn-acct ◴[] No.43536968[source]
But they don’t really when some compilers silently remove code without mentioning it.
replies(1): >>43537170 #
Calavar ◴[] No.43537170[source]
The compiler removes code under the assumption that your code doesn't have UB. If your code has UB, that's a bug. "When my code is buggy the compiler outputs a buggy executable, but it's buggy in a different way than I want" has always struck me as somewhat of an odd complaint.
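To make "removes code under the assumption that your code doesn't have UB" concrete, here is a minimal sketch using signed overflow. The function names are made up for illustration. The first check can only be true if `x + 1` overflows, which is UB for `int`, so an optimizing compiler is permitted to treat it as always false and delete the branch that depends on it. The second check asks the same question without ever overflowing.

```cpp
#include <limits>

// Hypothetical "naive" check. For a signed int, x + 1 overflowing is UB,
// so the compiler may assume it never happens and fold this to `false`.
inline bool increment_wraps_naive(int x) {
    return x + 1 < x;
}

// Well-defined alternative: compare against the limit before doing arithmetic.
constexpr bool increment_would_overflow(int x) {
    return x == std::numeric_limits<int>::max();
}
```

Whether a given compiler actually folds the naive version depends on the optimization level and flags, so treat this as an illustration of what is allowed, not a guarantee of what any particular build does.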

Of course it can be difficult to know when you've unintentionally hit UB, which leaves room for footguns. This is probably an unpopular opinion, but to me that's not an argument for rolling back UB-based optimizations; it's an argument for better diagnostics (are you *sure* you meant to do this), rigorous testing, and for eliminating some particularly tricky instances of UB in future revisions of the standard.

replies(2): >>43537387 #>>43539105 #
adgjlsfhk1 ◴[] No.43537387[source]
The problem is that any UB that is too difficult for a compiler to turn into a compile time error is also too difficult for humans to reliably prevent.
replies(1): >>43538038 #
Calavar ◴[] No.43538038[source]
UB can't be a compile time error though. And I don't mean it's too hard because it would require Turing complete/undecidable compile time analysis, I mean a compile time error for UB would be a violation of the language contract with the programmer. The programmer can say trust me, I don't need bounds checking in this function because the caller ensures that the index is in bounds. And this could actually be a safe assumption if, let's say, this function will only ever be called by machine generated code. You can't statically analyze the presence/absence of UB there, even if you magic a way around the decidability problem, because you don't know if the programmer was right that all inputs the function will ever see are guaranteed to be safe.
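A rough sketch of that "trust me" contract, with hypothetical helper names: `get_unchecked` is valid C++ and compiles cleanly, and whether it ever hits UB depends entirely on what callers pass in, which the compiler cannot see from the function alone. `try_get` is the checked counterpart for callers that can't make the promise.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper: the caller promises that i < v.size().
// operator[] does no bounds check, so breaking the promise is UB.
inline int get_unchecked(const std::vector<int>& v, std::size_t i) {
    assert(i < v.size());  // debug-build check only; compiled out under NDEBUG
    return v[i];
}

// Hypothetical checked counterpart: reports failure instead of trusting the caller.
inline bool try_get(const std::vector<int>& v, std::size_t i, int& out) {
    if (i >= v.size()) return false;
    out = v[i];
    return true;
}
```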

C++ has two issues with UB: 1) potentially UB operations are on by default and not opt-in at a granular level and 2) there are a whole lot of them.

Rust has shown that mortals can write robust code in a language with UB if unsafe operations are opt-in and the number of rules is small enough that you can write tooling to help catch errors (Miri). And Rust can be much more aggressive than GCC or Clang at making UB-based optimizations, particularly when it comes to things like pointer aliasing.

replies(1): >>43546268 #
1. adgjlsfhk1 ◴[] No.43546268[source]
I don't think this is true. The standard says that valid code does not run UB, so if the compiler can prove that code runs UB, it is invalid code.
replies(1): >>43561454 #
2. Calavar ◴[] No.43561454[source]
> the compiler can prove that code runs UB, it is invalid code

The *potential* to hit UB is not invalid according to the standard. If it was, then an unchecked array access would be illegal.

If every possible input state for an operation hits UB, then sure the compiler could diagnose that and emit an error. But apart from stuff like type punning, how often does that happen in the real world?

Real world UB pops up in corner cases that weren't considered carefully: What if you get weird input and your divisor ends up being zero? What if your string or list ends up being empty? Or, like the example in the article, what if your std::sort predicate encounters a NaN? You might not realize that your code has these corner cases, but the compiler does, and it optimizes assuming that you're aware of these corner cases and assuring it that the executable will never actually hit those corner cases. The compiler can't emit errors for that because it's not invalid code - it might be the intended behavior. The best it can do is a warning.
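For the NaN case specifically, here is a small sketch of why plain `<` on doubles stops being a valid `std::sort` ordering once a NaN shows up, plus one possible workaround. Both helper names are made up for illustration. `std::sort` requires a strict weak ordering, which means "neither is less than the other" has to behave like an equivalence. NaN is equivalent to every value under `<`, yet 1.0 and 2.0 are not equivalent to each other, so transitivity fails.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Two values are "equivalent" under operator< when neither orders before the other.
inline bool equivalent(double a, double b) {
    return !(a < b) && !(b < a);
}

// Hypothetical helper: move NaNs to the back, then sort only the NaN-free prefix,
// where operator< is a valid strict weak ordering. Returns the count of non-NaN values.
inline std::size_t sort_ignoring_nan(std::vector<double>& v) {
    auto nan_begin = std::partition(v.begin(), v.end(),
                                    [](double x) { return !std::isnan(x); });
    std::sort(v.begin(), nan_begin);
    assert(std::is_sorted(v.begin(), nan_begin));  // debug-build sanity check
    return static_cast<std::size_t>(nan_begin - v.begin());
}
```

Partitioning first is just one option; a comparator that gives NaN a fixed position would also work. Either way the fix lives in the program, since the unguarded `std::sort(v.begin(), v.end())` call is perfectly valid code for NaN-free input.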

It boils down to the programmer-language contract on UB being implicit and difficult to grok. Compilers can't fix that.