
92 points endorphine | 2 comments
pcwalton ◴[] No.43537392[source]
I was disappointed that Russ didn't mention the strongest argument for making arithmetic overflow UB. It's a subtle thing that has to do with sign extension and loops. The best explanation is given by ryg here [1].

As a summary: The most common way given in C textbooks to iterate over an array is "for (int i = 0; i < n; i++) { ... array[i] ... }". The problem comes from these three facts: (1) i is a signed integer; (2) i is 32-bit; (3) pointers nowadays are usually 64-bit. That means that a compiler that can't prove that the increment on "i" won't overflow (perhaps because "n" was passed in as a function parameter) has to do a sign extend on every loop iteration, which adds extra instructions in what could be a hot loop, especially since you can't fold a sign extending index into an addressing mode on x86. Since this pattern is so common, compiler developers are loath to change the semantics here--even a 0.1% fleet-wide slowdown has a cost to FAANG measured in the millions.
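To make the widening concrete, here is a minimal sketch in C (the function names are hypothetical, and a 64-bit target with 32-bit `int` is assumed). The second function spells out the conversion that `array[i]` implies: the 32-bit signed index is converted to a pointer-width offset before it is added to the base pointer. Whether a given compiler actually emits a separate sign-extend instruction depends on the target and on what it can prove about `i`.

```c
#include <stddef.h>
#include <stdint.h>

/* Textbook form: the index is a 32-bit signed int. */
int64_t sum_textbook(const int32_t *array, int n) {
    int64_t total = 0;
    for (int i = 0; i < n; i++) {
        total += array[i];
    }
    return total;
}

/* The same loop with the index widening written out explicitly. */
int64_t sum_explicit_widening(const int32_t *array, int n) {
    int64_t total = 0;
    for (int i = 0; i < n; i++) {
        /* On a 64-bit target this conversion is a sign extension
           from 32 bits to pointer width. */
        ptrdiff_t offset = (ptrdiff_t)i;
        total += *(array + offset);
    }
    return total;
}
```

Both functions compute the same result; the second only makes visible the conversion that the first performs implicitly.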

Note that the problem goes away if you use pointer-width indices for arrays, which many other languages do. It also goes away if you use C++ iterators. Sadly, the C-like pattern persists.
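A sketch of the pointer-width variant, assuming `size_t` has the same width as a pointer (true on mainstream 64-bit platforms, though the C standard does not strictly guarantee it). The function name is hypothetical. Because the index already has pointer width, no widening is needed when it is used as an offset.

```c
#include <stddef.h>
#include <stdint.h>

/* Index and length are size_t, so the index needs no widening
   before being added to the base pointer. */
int64_t sum_size_t_index(const int32_t *array, size_t n) {
    int64_t total = 0;
    for (size_t i = 0; i < n; i++) {
        total += array[i];
    }
    return total;
}
```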

[1]: https://gist.github.com/rygorous/e0f055bfb74e3d5f0af20690759...

replies(6): >>43537702 #>>43537771 #>>43537976 #>>43538026 #>>43538237 #>>43538348 #
AlotOfReading ◴[] No.43537702[source]
There are half a dozen better ways this could have been addressed at any time in the past decade.

Anything from making it implementation defined to unspecified behavior to just throwing a diagnostic warning or having a clang-tidy performance rule.

I'm also incredibly suspicious of the idea that FAANG in particular won't accept minor compiler slowdowns for useful safety. Google and Apple for example have both talked publicly about how they're pushing bounds checking by default internally and you can see that in the Apple Buffer hardening RFC and the Abseil hardening modes.

replies(1): >>43537833 #
pcwalton ◴[] No.43537833[source]
> Anything from making it implementation defined to unspecified behavior to just throwing a diagnostic warning or having a clang-tidy performance rule.

To be clear, you're proposing putting a warning on "for (int i = 0; i < n; i++)"? The most common textbook way to write a loop in C?

> I'm also incredibly suspicious of the idea that FAANG in particular won't accept minor compiler slowdowns for useful safety.

I worked on compilers at FAANG for quite a while and know quite well how these teams justify their existence. Telling executives "we cost the company $1M a quarter, but good news, we made the semantics of the language easier for programming language nerds to understand" instead of "we saved the company $10M last quarter" is an excellent strategy for getting the team axed next time downsizing comes around.

replies(4): >>43538103 #>>43538139 #>>43538208 #>>43551655 #
frumplestlatz ◴[] No.43538103[source]
Even if it is the most common method in text books (I’m not sure that’s true), it’s also almost always wrong. The index must always be sized to fit what you’re indexing over.

As for your compiler statement — yes. At least at Apple, there is ongoing clang compiler work, focused on security, that actively makes things slower, and there has been for years.

replies(1): >>43538414 #
1. pcwalton ◴[] No.43538414[source]
> Even if it is the most common method in text books (I’m not sure that’s true), it’s also almost always wrong. The index must always be sized to fit what you’re indexing over.

"The code was wrong, so it was OK that I made it slower" is a message board argument, not a business argument.

> As for your compiler statement — yes. At least at Apple, there is ongoing clang compiler work, focused on security, that actively makes things slower, and there has been for years.

The performance of code that runs on consumer devices has less of a measurable economic impact than that of code that runs on the server.

replies(1): >>43574658 #
2. frumplestlatz ◴[] No.43574658[source]
> "The code was wrong, so it was OK that I made it slower" is a message board argument, not a business argument.

And yet, here I am, in a business, watching that argument play out in real time.

> The performance of code that runs on consumer devices has less of a measurable economic impact than that of code that runs on the server.

The performance of the device literally in the entire world’s hands every day is arguably quite a bit more impactful than Facebook having to buy more servers.