232 points pjmlp
derriz ◴[] No.43534525[source]
Sane defaults should be table stakes for toolchains but C++ has "history".

All significant C++ codebases and projects I've worked on have had tens of lines (if not screens) of compiler and linker options, a maintenance nightmare, particularly for anything related to optimization. This stuff is so brittle: who knows when (with which release of the compiler or linker) a particular combination of optimization flags was actually beneficial? How do you regression test this stuff? So everyone is afraid to touch it.

Other compiled languages have similar issues, but none that I've experienced to the extent of C++.

replies(4): >>43534781 #>>43535229 #>>43535747 #>>43543362 #
duped ◴[] No.43535229[source]
I mean, if you emit compiler commands from any build system they're going to be completely illegible due to the number of -L, -l, -I, -i, -D flags, which are mostly generated by things like pkg-config and your build configuration.

There aren't many optimization flags that people get fine-grained with, the exception being floating point, because -ffast-math alone is extremely inadvisable.

replies(2): >>43535861 #>>43547656 #
dapperdrake ◴[] No.43535861[source]
-ffast-math and -Ofast are inadvisable on principle:

TL;DR: Python's gevent messes up your floating-point control register, MXCSR (yes).

https://moyix.blogspot.com/2022/09/someones-been-messing-wit...

replies(2): >>43535920 #>>43542503 #
duped ◴[] No.43535920[source]
I disagree with "on principle." There are flaws in the design of IEEE 754, and relaxing strict adherence for the sake of performance is fine, if not required, for some applications.

For example, recursive filters (even the humble averaging filter) will suffer untold pain without enabling DAZ/FTZ mode.
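
To make the filter point concrete, here is a minimal sketch (not from the thread) of a one-pole averaging filter whose state decays into the subnormal range once its input goes silent. The names `one_pole_step` and `count_subnormal_steps` and the 0.5 coefficient are illustrative assumptions, and it assumes default IEEE 754 behavior, i.e. a build without -ffast-math and with FTZ/DAZ off.

```cpp
#include <cmath>

// Hypothetical one-pole averaging filter: y[n] = a*x[n] + (1 - a)*y[n-1].
inline float one_pole_step(float state, float input, float a) {
    return a * input + (1.0f - a) * state;
}

// Feed one unit impulse followed by silence and count how many output
// samples land in the subnormal range. Assumes FTZ/DAZ are off.
inline int count_subnormal_steps(int steps) {
    const float a = 0.5f;                        // illustrative coefficient
    float state = one_pole_step(0.0f, 1.0f, a);  // impulse response starts at 0.5
    int subnormal = 0;
    for (int i = 0; i < steps; ++i) {
        state = one_pole_step(state, 0.0f, a);   // silence: the state halves
        if (std::fpclassify(state) == FP_SUBNORMAL) {
            ++subnormal;
        }
    }
    return subnormal;
}
```

A short run such as `count_subnormal_steps(10)` never leaves the normal range, while a long tail of silence does. On many CPUs arithmetic on subnormal values is far slower than on normal ones, which is the pain the comment refers to; with FTZ on, the state would snap to zero instead.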

FWIW, the linked issue has been remedied in recent compilers, and it isn't a Python problem, it's a gcc problem. That said, if your algorithm requires subnormal numbers, for the love of numeric stability, guard your scopes and set the MXCSR register accordingly!

replies(3): >>43536009 #>>43536613 #>>43538502 #
dapperdrake ◴[] No.43536009[source]
In practice, "some applications" seems to include almost all of NumPy and Python. Good call.

Like with the Java sin() fixes: if you don't care about the results being correct why not constant-fold an arbitrary number? Way faster at run-time.

replies(1): >>43536528 #
duped ◴[] No.43536528[source]
All numerical methods define "correct" to be within a range or to some precision. There are very few algorithms that require FTZ mode to be "correct" - the linked article and the article it links don't even have an example (there are good examples of where, say, -ffinite-math-only is super dangerous, because infs/NaNs are way more common than arithmetic on subnormal numbers).
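
As a hedged illustration of the inf/NaN point: GCC's `-ffinite-math-only` lets the optimizer assume that arguments and results are never NaN or infinite, so a guard like the one below may be dropped. `safe_ratio` is a hypothetical helper, not something from the thread, and it behaves as commented only when built without that flag.

```cpp
#include <cmath>

// Hypothetical helper: returns a / b, or `fallback` when the quotient is
// NaN or infinite (for example 0/0 or 1/0).
inline double safe_ratio(double a, double b, double fallback) {
    const double q = a / b;
    // Under -ffinite-math-only the compiler may assume q is always finite
    // and is allowed to remove this check entirely.
    if (!std::isfinite(q)) {
        return fallback;
    }
    return q;
}
```

Division by zero turns up in ordinary code all the time, which is why this assumption tends to bite more often than flushed subnormals do.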

And yeah, the fact that crtfastmath.o was being linked into shared libraries, fucking up the precision of some computations depending on library dependencies (and the order they're loaded!), was bad, but it lingered in the entire Linux ecosystem for over a decade. So how bad was it, if it took that long to notice?

If you have a numerical algorithm that requires subnormal arithmetic to converge, a) don't, that's super shaky; b) set/unset MXCSR at the top/bottom of your function and ensure you never unwind the stack without resetting it. It's preserved across context switches, so you're not going to get blown away by the OS scheduler.
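
One way to get the set/unset pairing without forgetting the reset is an RAII guard. This is a sketch under assumptions: x86 with SSE only, `SubnormalScope` is a made-up name, and the bit masks are MXCSR bit 15 (FTZ) and bit 6 (DAZ). The destructor runs on normal return and on exception unwinding alike.

```cpp
#include <xmmintrin.h>  // _mm_getcsr / _mm_setcsr (x86 SSE only)

// Hypothetical RAII guard: clears FTZ and DAZ for the current scope and
// restores the caller's MXCSR on every exit path, including exceptions.
class SubnormalScope {
public:
    SubnormalScope() : saved_(_mm_getcsr()) {
        _mm_setcsr(saved_ & ~(kFtz | kDaz));
    }
    ~SubnormalScope() { _mm_setcsr(saved_); }

    // Non-copyable: exactly one restore per saved state.
    SubnormalScope(const SubnormalScope&) = delete;
    SubnormalScope& operator=(const SubnormalScope&) = delete;

private:
    static constexpr unsigned int kFtz = 0x8000u;  // MXCSR bit 15: flush-to-zero
    static constexpr unsigned int kDaz = 0x0040u;  // MXCSR bit 6: denormals-are-zero
    unsigned int saved_;
};
```

Declaring `SubnormalScope guard;` at the top of the function that needs gradual underflow is enough. One caveat: compilers do not necessarily treat MXCSR as an input to floating-point operations, so keeping the sensitive math in its own non-inlined function is a reasonable precaution.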

This isn't practical numerical methods in C 101, but it's at least 201. In practice you don't trust floats for bit-exact math. Use different types for that.
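
One possible reading of "different types" (an assumption; the comment doesn't name any) is scaled integers. The sketch below contrasts a double comparison that fails with a hypothetical fixed-point type, `Milli`, that stores whole thousandths.

```cpp
#include <cstdint>

// Binary doubles cannot represent 0.1, 0.2 or 0.3 exactly, so the rounded
// sum can differ from the rounded literal it is compared against.
inline bool double_sum_matches(double a, double b, double expected) {
    return a + b == expected;
}

// Hypothetical fixed-point type: an amount stored as a whole number of
// thousandths, so addition and comparison are exact integer operations.
struct Milli {
    std::int64_t value;
};

inline Milli operator+(Milli a, Milli b) { return Milli{a.value + b.value}; }
inline bool operator==(Milli a, Milli b) { return a.value == b.value; }
```

Here `double_sum_matches(0.1, 0.2, 0.3)` is false under IEEE 754 doubles, while `Milli{100} + Milli{200} == Milli{300}` holds exactly.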

replies(1): >>43537192 #
1. dapperdrake ◴[] No.43537192[source]
IEEE 754 defaults are for people who don't get deeply into numerical analysis and Cauchy sequences. Like, ostensibly, most FOSS maintainers. Or most people who write software in general.

There are people that do. HPC and the demoscene have numerous examples. Most of the people I met here are capable of reading gcc's manual and picking the optimizations they actually need. And they know how to debug this stuff.

If it's not obvious who gcc's defaults should cater to, then redefine human-friendly until it becomes obvious.