> It's because the UB must be continuously exploited by compilers for that extra 1% perf gain.
Your framing of a compiler exploiting UB in programs to gain performance, has an undeserved negative connotation. The fact is, programs are mathematical structures/arguments, and if any single step in the program code or execution is wrong, no matter how small, it can render the whole program invalid. Drawing from math analogies where one wrong step leads to an absurd conclusion:
* https://en.wikipedia.org/wiki/All_horses_are_the_same_color
* https://en.wikipedia.org/wiki/Principle_of_explosion
* https://proofwiki.org/wiki/False_Statement_implies_Every_Sta...
* https://en.wikipedia.org/wiki/Mathematical_fallacy#Division_...
Back to programming, hopefully this example will not be controversial: If a program contains at least one write to an arbitrary address (e.g. `*(char*)0x123 = 0x456;`), the overall behavior will be unpredictable and effectively meaningless. In this case, I would fully agree with a compiler deleting, reordering, and manipulating code as a result of that particular UB.
You could argue that C shouldn't have been designed so that reading out of bounds is UB. Instead, it should read some arbitrary value without crashing or cleanly segfault at that instruction, with absolutely no effects on any surrounding code.
You could argue that C/C++ shouldn't have made it UB to dereference a null pointer for reading, but I fully agree that dereferencing a null pointer for a method call or writing a field must be UB.
Another analogy in programming is, let's forget about UB. Let's say you're writing a hash table in Java (in the normal safe subset without using JNI or Unsafe). If you get even one statement wrong in the data structure implementation, there still might be arbitrarily large consequences like dropping values when you shouldn't, miscounting how many values exist, duplicating values when you shouldn't, having an incorrect state that causes subtle failures far in the future, etc. The consequences are not as severe and pervasive as UB at the language level, but it will still result in corrupt data and/or unpredictable behavior for the user of that library code, which can in turn have arbitrarily large consequences. I guess the only difference compared to C/C++ UB is that for C/C++, there is more "spooky action at a distance", where some piece of UB can have very non-local consequences. But even incorrect code in safe Java can produce large consequences, maybe just not as large on average.
I am not against compilers "exploiting" UB for performance gain. But these are the ways forward that I believe in, for any programming language in general:
* In the language specification, reduce the number of cases/places that are undefined. Not only does it reduce the chances of bad things happening, but it also makes the rules easier to remember for humans, thus making it easier to avoid triggering these cases.
* Adding to that point, favor compile-time errors over run-time UB. For example, reading from an uninitialized local variable is a compile error in Java but UB in C. Rust's whole shtick about lifetimes and borrowing is one huge transformation of run-time problems into compile-time problems.
* Overwhelmingly favor safety by default. For example, array accesses should be bounds-checked using the convenient operator like `array[index]`, whereas the unsafe unchecked version should be something obnoxious and ugly like `unsafe { array.get_unchecked(index) }`. Make the safe way easy and make the unsafe way hard - the exact opposite of C/C++.
* Provide good (and preferably complete) sanitizer tools to check that UB isn't triggered at run time. C/C++ did not have these for the first few decades of their lives, and you were flying blind when triggering UB.