92 points endorphine | 4 comments
pcwalton ◴[] No.43537392[source]
I was disappointed that Russ didn't mention the strongest argument for making arithmetic overflow UB. It's a subtle thing that has to do with sign extension and loops. The best explanation is given by ryg here [1].

As a summary: The most common way given in C textbooks to iterate over an array is "for (int i = 0; i < n; i++) { ... array[i] ... }". The problem comes from these three facts: (1) i is a signed integer; (2) i is 32-bit; (3) pointers nowadays are usually 64-bit. If signed overflow were defined to wrap, a compiler that can't prove that the increment on "i" won't overflow (perhaps because "n" was passed in as a function parameter) would have to sign-extend "i" on every loop iteration. That adds extra instructions in what could be a hot loop, especially since you can't fold a sign-extending index into an addressing mode on x86. Since this pattern is so common, compiler developers are loath to change the semantics here; even a 0.1% fleet-wide slowdown has a cost to FAANG measured in the millions.
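To make that concrete, here is a rough sketch in C. The function names are made up for illustration and nothing here is taken from ryg's gist. The first function is the textbook loop. The second spells out by hand what the loop would mean if int arithmetic wrapped: the index is kept as 32 bits that wrap modulo 2^32 and is widened to pointer width on every access. It assumes a 32-bit int and a typical 64-bit target.

```c
#include <stddef.h>
#include <stdint.h>

/* Textbook loop: 32-bit signed induction variable.
 * Because overflowing `i` is undefined behavior, the compiler may
 * treat `i` as if it were a pointer-width counter. */
long long sum_int_index(const int *array, int n) {
    long long total = 0;
    for (int i = 0; i < n; i++) {
        total += array[i];
    }
    return total;
}

/* Hypothetical model of the same loop under wrapping semantics.
 * The counter lives in 32 bits and wraps modulo 2^32, so each
 * access has to widen (sign-extend) it to pointer width first. */
long long sum_wrapping_index(const int *array, int n) {
    long long total = 0;
    uint32_t bits = 0;                 /* raw 32-bit counter */
    while ((int32_t)bits < n) {
        int32_t i = (int32_t)bits;     /* reinterpret as signed */
        ptrdiff_t wide = (ptrdiff_t)i; /* explicit sign extension */
        total += array[wide];
        bits += 1u;                    /* unsigned add, wraps by definition */
    }
    return total;
}
```

Both functions return the same result for any valid input. The difference is only in what the compiler is allowed to assume about the counter.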

Note that the problem goes away if you use pointer-width indices for arrays, which many other languages do. It also goes away if you use C++ iterators. Sadly, the C-like pattern persists.
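A minimal sketch of the two alternatives in plain C, again with hypothetical function names. The first uses a pointer-width index (size_t). The second walks a pointer, which is roughly the C analogue of iterating with C++ iterators. In neither case is there a narrower index that needs widening before the address computation.

```c
#include <stddef.h>

/* Pointer-width index: size_t already matches the address width
 * on common targets, so no per-iteration widening is needed. */
long long sum_size_index(const int *array, size_t n) {
    long long total = 0;
    for (size_t i = 0; i < n; i++) {
        total += array[i];
    }
    return total;
}

/* Pointer walk: iterate from the first element to one past the
 * last, with no separate index variable at all. */
long long sum_pointer_walk(const int *array, size_t n) {
    long long total = 0;
    const int *end = array + n;        /* one-past-the-end pointer */
    for (const int *p = array; p != end; p++) {
        total += *p;
    }
    return total;
}
```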

[1]: https://gist.github.com/rygorous/e0f055bfb74e3d5f0af20690759...

1. tmoravec ◴[] No.43537976[source]
size_t has been in the C standard since C89. "for (int i = 0..." might have its uses, so it doesn't make sense to disallow it. But I'd argue that it's not really a common textbook way to iterate over an array.
2. pcwalton ◴[] No.43537997[source]
The first example program that demonstrates arrays in The C Programming Language 2nd edition (page 22) uses signed integers for both the induction variable and the array length (the literal 10 becomes int).
3. frumplestlatz ◴[] No.43538148[source]
The language has evolved significantly, and we’ve learned a lot about how to write safer C, since that was published in 1988.
4. Maxatar ◴[] No.43539066[source]
From what I see, that book was published in 1988.