Thanks for your response and for taking my comments seriously.
> I take issue with the standard ... the original sin is in the standard
Me too! As per my other comments, I think the C and C++ standards are insane with so many cases of UB, and even totally preventable cases like "if a source file doesn't end with newline". Pretty much every other language has far less UB than C and C++.
> Compiler writers aren't blameless
This is probably where we differ. Yes, I acknowledge that compilers became more aggressive over the years. But as someone who writes math proofs, I appreciate the notion of deriving logical consequences of things. For example, if you sneak in just a single division-by-zero step, you can prove that 1 = 2, and in turn prove that 0 = 1. I maintain that compilers are maximizing what they can exploit within the bounds of UB; i.e. if the compiler can prove that you triggered UB, then it can do anything it wants.
> Why should I care about that when this machine decided it is oke to irradiate me to lethal levels?
Look, even if you had the friendliest C compiler, there can still be a million other reasons why the machine irradiated you. Maybe there's a manufacturing defect and a wire is flailing around and short circuited something. Maybe you wrote code in assembly language and read an out-of-bounds memory address and simply got back the same value that was most recently on the memory bus (open bus syndrome; a bunch of SNES exploits rely on that); that is UB at a hardware level. And then maybe you dereferenced that value as a pointer.
I am aware that many people are offended by compilers exploiting UB, but that sentiment seems extremely misdirected because they're not willing to confront the fact that the code writer did not comply with the preconditions of the language standard; they wrote faulty code and so the compiler made a faulty executable, GIGO.
> The language standard is the entire reason that C++ is the shit show it is. You are holding up like it's a book handed down by god. It's not.
I am extremely frustrated with C/C++ too ( https://www.nayuki.io/page/undefined-behavior-in-c-and-cplus... , https://www.nayuki.io/page/summary-of-c-cpp-integer-rules ). I fully know that humans wrote that, very fallible humans who chose questionable compromises with respect to performance and compatibility. My other comment ( https://news.ycombinator.com/item?id=43539554 ) already advocates for tightening the standard to reduce as many cases of UB as possible - I even said that I think reading out of bounds should either return an arbitrary value or crash cleanly, which might be more radical than most viewpoints!
> It's an extremely flawed document that has cost the world trillions of dollars and has meaningfully set back the advancement of the human race.
I agree. The C and C++ committees refuse to curb UB, and the rest of us have to deal with the consequences.
> when [compilers] decided what to do on undefined behavior in some cases
Correct, because that's what the C and C++ language standards say. Once the programmer writes code that hits undefined behavior, the standard says that there is no requirement on the compiler and runtime to behave in any certain way. Don't shoot the messenger; the compiler is maximizing its legal exploitation within the language standard. Blame the language standard and petition for change.
> I indeed don't find it controversial that one arbitrary write to memory (like `(char)0x123 = 0x456;`) is sufficient grounds that nobody can predict the unbounded consequences of it. I do find it controversial that this is possible to achieve by default in any modern language.
Sure, a compiler project like GCC could pledge to support a dialect of C that is gentler than the standard. They could, like my example above, guarantee that reading out of bounds will simply generate a machine instruction and let the machine either read some value or page fault, without inferring from the read that the index must be in bounds for the rest of the program (relating to the `k < 16`, `d[k]` subthread). But let's call it what it is - it would be a dialect of C, probably only supported by GCC, and not by any other compiler (LLVM, Microsoft, Intel, etc.). It would be an island, not a standard. If they want their semantics to be universally accepted, they will have no choice but to propagate it up to the C standard.
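To make the `k < 16` / `d[k]` inference concrete, here is a rough sketch. The table `d` and the function `read_then_check` are hypothetical names for illustration, not the code from that subthread, and whether a given compiler actually performs this transformation depends on the compiler and its flags.

```c
#include <stddef.h>

static const int d[16] = {0, 1, 4, 9};

/* Hypothetical example: the array read happens before the bounds check. */
int read_then_check(size_t k) {
    int value = d[k];   /* UB under the C standard if k >= 16 */
    if (k < 16) {       /* a compiler may treat this as always true, because d[k] was already read */
        return value;
    }
    return -1;          /* so this branch may be deleted as unreachable */
}
```

For in-bounds calls such as `read_then_check(3)` the behavior is fully defined. A gentler dialect would promise that an out-of-bounds `k` still reaches the `return -1` branch (or faults on the read), instead of letting the read erase the check.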
> Programming language designers don't [contend with physical reality], programming languages are literally made up.
There's some truth in that, but it's incomplete. One thing that's not debatable is that programming and math are intertwined - from basic stuff like arithmetic, to inequalities/ranges, to iteration and recursion, to full-blown proofs about behavior. Once you see it through that lens, you realize that certain features lead to contradictions that would make for a very confusing programming language.
In the case of C/C++, this is how I see it: If you have arrays as objects that live in memory, then naturally you have bounds; the array has a finite start and end. If you index that array, you need to decide what happens if the index is out of bounds. One option is to go through extensive math proofs to show at compile time that the index is always in bounds, so you can go ahead with no run-time overhead. Otherwise, you either pay for run-time checks, or you throw your hands up and say "whatever happens, happens" - and that's a source of UB.
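The three choices above can be sketched roughly as follows. The function names `sum_all`, `get_checked`, and `get_unchecked` are made up for illustration; this is a minimal sketch of the tradeoff, not how any particular compiler or library handles it.

```c
#include <stdbool.h>
#include <stddef.h>

/* Choice 1: the loop condition itself proves i < len, so no extra check is needed. */
long sum_all(const int *arr, size_t len) {
    long total = 0;
    for (size_t i = 0; i < len; i++) {
        total += arr[i];
    }
    return total;
}

/* Choice 2: pay for a run-time check and report failure instead of reading out of bounds. */
bool get_checked(const int *arr, size_t len, size_t index, int *out) {
    if (index >= len) {
        return false;
    }
    *out = arr[index];
    return true;
}

/* Choice 3: no check at all; the caller must guarantee the index is valid, otherwise it is UB. */
int get_unchecked(const int *arr, size_t index) {
    return arr[index];
}
```

C's built-in indexing is choice 3 by default; languages like Java and Rust default to choice 2 and let the optimizer fall back to choice 1 when it can prove the check is redundant.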
> Because that is approximately the number of professional C++ programmers that are capable of writing undefined free code.
I fully agree with this. In my C & C++ learning journey, I had to unlearn a lot of awful habits that were taught to me implicitly from other people's writing and code. The simplest example is the notion that you can overflow a signed integer and then print its value and examine what happened - no, the moment you overflowed it, all bets are off and you cannot debug that in general (unless you have UBSan enabled, which is basically a dialect). Even now that I'm very aware of UB, it is very mentally taxing to remember all the rules and check against them for every line of code that I write. There's a reason I don't use these languages, and use managed ones like Java/JavaScript/Python or Rust, because I don't have enough time and brain cells to write 100% perfect C/C++ code 100% of the time. Based on the difficulties I faced, I have low trust in other people writing correct C & C++ code.
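For the signed overflow case, the check has to happen before the addition, not after. Here is a minimal sketch; `checked_add` is a hypothetical helper name, and it only covers `int` addition.

```c
#include <limits.h>
#include <stdbool.h>

/* Hypothetical helper: returns false instead of overflowing.
   The comparisons are arranged so that no intermediate result can overflow. */
bool checked_add(int a, int b, int *out) {
    if ((b > 0 && a > INT_MAX - b) || (b < 0 && a < INT_MIN - b)) {
        return false;   /* a + b would overflow, so never compute it */
    }
    *out = a + b;       /* safe: the result is known to fit in int */
    return true;
}
```

Writing `int sum = a + b; if (sum < a) ...` instead is the habit to avoid: by the time the comparison runs, the overflow has already happened, and the compiler is allowed to assume it never did.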
> The C abstract machine is an abomination that should be banished
Sorry, no. If anything, I somewhat appreciate that it calls out the name "abstract machine" instead of implying it through semantics.
If C code can be compiled for x86 and ARM, with or without an MMU, the compiler is reasoning about an abstract machine and not doing all its transformations and optimizations with respect to each concrete machine that it supports. It is literally an abstraction.
Can the abstract machine be tightened up to be less programmer-hostile? Absolutely. Change the language standard.
Even in the case of the Java virtual machine, it is still an abstract machine because there are still a bunch of implementation details that differ somewhat on real JVMs (from different vendors) running on real CPUs. And of course many code optimizations/transformations are first done with respect to the abstract JVM, and only later refined for the physical machine (e.g. the choice of ADD vs. LEA on x86; specific instructions like POPCNT vs. fallback).
> The whole concept of undefined behavior is something that should never have existed.
Sorry, nope, this is impossible. Look at Rust; even it acknowledges UB, which is a subset of C and C++ UB - https://doc.rust-lang.org/reference/behavior-considered-unde... . Even Java has UB if you use sun.misc.Unsafe, which can be useful for performance-sensitive code, native memory management, and building reflective frameworks; and of course also when using JNI.
I guess you can fully eliminate UB if you heavily restrict the semantics of the language and/or accept performance reductions. For example, you can definitely make a UB-free version of Brainfuck with its mere 8 instructions (you just need to define how you want to handle overflow and the left side of the tape). You can also make UB-free C/C++ if you bounds-check all your writes and keep track of variable lifetimes - actually, someone did that as the Fil-C project, and it can compile and run existing code with a small performance penalty.
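As a rough sketch of what "defining everything" looks like for Brainfuck, here is a small interpreter in which every program has a specified outcome. All names (`bf_run`, the `bf_status` values, `TAPE_LEN`) and the specific choices are assumptions for illustration: cells wrap around as `unsigned char`, moving off either end of the tape is a reported error, `,` always stores 0, and a step budget stops runaway loops.

```c
#include <stddef.h>
#include <string.h>

#define TAPE_LEN 30000

enum bf_status { BF_OK, BF_TAPE_UNDERFLOW, BF_TAPE_OVERFLOW,
                 BF_UNMATCHED_BRACKET, BF_OUTPUT_FULL, BF_STEP_LIMIT };

/* Runs prog; bytes printed by '.' go into out, and their count into *out_len. */
enum bf_status bf_run(const char *prog, size_t max_steps,
                      unsigned char *out, size_t out_cap, size_t *out_len) {
    static unsigned char tape[TAPE_LEN];
    size_t n = strlen(prog), ptr = 0, steps = 0;
    memset(tape, 0, sizeof tape);
    *out_len = 0;
    for (size_t pc = 0; pc < n; pc++) {
        if (steps++ >= max_steps) return BF_STEP_LIMIT;
        switch (prog[pc]) {
        case '+': tape[ptr]++; break;   /* unsigned char wraps around, by definition */
        case '-': tape[ptr]--; break;
        case '>': if (ptr + 1 >= TAPE_LEN) return BF_TAPE_OVERFLOW; ptr++; break;
        case '<': if (ptr == 0) return BF_TAPE_UNDERFLOW; ptr--; break;
        case '.':
            if (*out_len >= out_cap) return BF_OUTPUT_FULL;
            out[(*out_len)++] = tape[ptr];
            break;
        case ',': tape[ptr] = 0; break; /* assumption: no input source, so store 0 */
        case '[':
            if (tape[ptr] == 0) {       /* skip forward to the matching ']' */
                size_t depth = 1;
                while (depth > 0) {
                    if (++pc >= n) return BF_UNMATCHED_BRACKET;
                    if (prog[pc] == '[') depth++;
                    else if (prog[pc] == ']') depth--;
                }
            }
            break;
        case ']':
            if (tape[ptr] != 0) {       /* jump back to the matching '[' */
                size_t depth = 1;
                while (depth > 0) {
                    if (pc == 0) return BF_UNMATCHED_BRACKET;
                    pc--;
                    if (prog[pc] == ']') depth++;
                    else if (prog[pc] == '[') depth--;
                }
            }
            break;
        default: break;                 /* any other character is a comment */
        }
    }
    return BF_OK;
}
```

Each of those choices costs something (a comparison on every pointer move, a counter on every step), which is the kind of overhead C and C++ decline to pay by default.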
For better or for worse, C & C++ emphasize backwards compatibility and avoiding runtime costs - to the detriment of almost every other desirable feature like human comprehension. You probably don't like this choice, but it is the tradeoff that they made because they prioritized some features over others.