
92 points by endorphine | 3 comments
pcwalton:
I was disappointed that Russ didn't mention the strongest argument for making signed arithmetic overflow UB. It's a subtle thing that has to do with sign extension and loops. The best explanation is given by ryg here [1].

In summary: the most common way C textbooks iterate over an array is "for (int i = 0; i < n; i++) { ... array[i] ... }". The problem comes from three facts: (1) i is a signed integer; (2) i is 32-bit; (3) pointers nowadays are usually 64-bit. When overflow is UB, the compiler may assume i never wraps, so it can widen i to a 64-bit induction variable once and leave it there. But a compiler that can't prove the increment on "i" won't overflow (perhaps because "n" was passed in as a function parameter) would otherwise have to sign-extend "i" on every loop iteration, adding extra instructions to what could be a hot loop, especially since you can't fold a sign-extending index into an addressing mode on x86. Since this pattern is so common, compiler developers are loath to change the semantics here: even a 0.1% fleet-wide slowdown has a cost to FAANG measured in the millions.
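
To make that concrete, here is a minimal C sketch of the pattern (the `step` parameter is my addition, not part of the textbook loop: with a plain `i++` and an `i < n` bound the compiler can often prove no overflow occurs, so an unprovable step shows the codegen issue more directly):

    #include <stddef.h>

    /* The textbook pattern: 32-bit signed index into an array,
       on a target with 64-bit pointers. */
    long sum_c_style(const long *array, int n, int step) {
        long total = 0;
        for (int i = 0; i < n; i += step) {
            /* array[i] needs a 64-bit offset. Because signed overflow
               is UB, the compiler may assume i never wraps, widen it to
               a 64-bit induction variable once, and fold it into the
               addressing mode. If overflow were defined to wrap (e.g.
               gcc/clang -fwrapv), i could legally go negative, so the
               sign extension would have to be redone each iteration. */
            total += array[i];
        }
        return total;
    }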

Note that the problem goes away if you use pointer-width indices for arrays, which many other languages do. It also goes away if you use C++ iterators. Sadly, the C-like pattern persists.
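
For contrast, a sketch of the pointer-width version (the same hypothetical function as above, with `size_t` throughout):

    #include <stddef.h>

    /* Same loop with a pointer-width index: on a 64-bit target the
       index is already 64 bits wide, so there is nothing to extend,
       whatever the overflow rules are. */
    long sum_size_t(const long *array, size_t n, size_t step) {
        long total = 0;
        for (size_t i = 0; i < n; i += step) {
            total += array[i];
        }
        return total;
    }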

[1]: https://gist.github.com/rygorous/e0f055bfb74e3d5f0af20690759...

dcrazy:
The C language does not specify that `int` is 32 bits. That is a choice made by compiler developers to make compiling non-portable code written for 32-bit platforms easier, because most codebases wind up baking in assumptions about variable sizes.

In Swift, for example, `Int` is 64 bits wide on 64-bit targets. If we ever move to 128-bit CPUs, the Swift project will be forced to decide whether to stick to its guns (making `Int` 128 bits) or keep `Int` at 64 bits on 128-bit targets.
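
A quick C illustration of the point (a sketch: the printed widths are ABI choices such as ILP32 or LP64 and vary by target; the language itself only guarantees that `int` is at least 16 bits):

    #include <limits.h>
    #include <stdint.h>
    #include <stdio.h>

    int main(void) {
        /* These widths come from the target ABI, not from the
           C standard. */
        printf("int:    %zu bits\n", sizeof(int) * CHAR_BIT);
        printf("long:   %zu bits\n", sizeof(long) * CHAR_BIT);
        printf("size_t: %zu bits\n", sizeof(size_t) * CHAR_BIT);
        /* Code that genuinely needs a fixed width can say so
           explicitly instead of assuming one: */
        int32_t exactly_32_bits = 0;
        (void)exactly_32_bits;
        return 0;
    }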

pcwalton:
> The C language does not specify that `int` is 32 bits. That is a choice made by compiler developers to make compiling non-portable code written for 32-bit platforms easier, because most codebases wind up baking in assumptions about variable sizes.

Making `int` 32-bit also results in significant memory savings: an array of a million ints is 4 MB rather than the 8 MB it would occupy at 64 bits.

bobmcnamara:
And it even wastes cycles on MCUs where `size_t` is 16 bits, since every 32-bit int operation there takes multiple native instructions.
moefh:
Is there any MCU where `size_t` is 16 bits but `int` is 32 bits? I'm genuinely curious; I've never seen one.
dcrazy:
Me neither, but it wouldn't be unreasonable if the target has 32-bit ALUs but only 16 address lines and no MMU.
bobmcnamara:
The original 32-bit machine, the Manchester Baby, would likely have had a 32-bit int. But with only 32 words of RAM, C would be rather limited, though a static-stack implementation would work.