170 points judicious | 3 comments
1. FooBarBizBazz ◴[] No.45407200[source]
I used to do stuff like this (ok, not half as smart), but stopped around 2013 or so, as the distinction between "implementation defined" behavior (ok) and "undefined" behavior (not ok) started to matter and bite.

After thinking through this carefully, though, I do not see UB (except for signed overflow in a corner case):

Step 1, bit shift.

I understand that, until C++20, left shift of a negative signed int was UB. But this right shift appears to be implementation-defined, which is ok if documented in the code.

Step 2: Add.

Then, (x + mask) is defined behavior (a) for non-negative x, since then mask=0, and (b) for almost all negative x, since then mask=-1. However, if x is numeric_limits<T>::lowest(), then x + mask is lowest() - 1, which is signed integer overflow, and that is UB.

Step 3, XOR.

Then the final XOR doesn't have UB AFAICT. It wouldn't be UB as of C++20, when signed integers became two's complement officially. It might have been implementation defined before then, which would be almost as good for something that ubiquitous, but I'm not sure.
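For reference, the three steps can be sketched as below. The original code is not quoted in this thread, so this is a reconstruction from the description above: the name `abs_trick` and the choice of `int` are assumptions, and it presumes the right shift is arithmetic (guaranteed from C++20, implementation-defined before that).

```cpp
#include <limits>

// Hypothetical reconstruction of the trick being discussed; the name and the
// use of int are assumptions made for illustration only.
constexpr int abs_trick(int x) {
    // Step 1: shift the sign bit across the whole word.
    // digits is the number of non-sign bits (31 for a 32-bit int), so
    // mask is 0 for x >= 0 and -1 for x < 0 when the shift is arithmetic.
    const int mask = x >> std::numeric_limits<int>::digits;

    // Step 2: x + 0 for non-negative x, x - 1 for negative x.
    // This is the one place that overflows (UB): x == lowest().
    const int sum = x + mask;

    // Step 3: XOR with 0 is a no-op; XOR with -1 flips every bit,
    // and ~(x - 1) equals -x in two's complement.
    return sum ^ mask;
}
```

Every input other than `std::numeric_limits<int>::lowest()` stays inside the representable range at each step, which matches the reasoning above.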

Ok, so I think this does not involve UB except for an input of numeric_limits<T>::lowest().

Sound about right?

To fix this, perhaps you would need to make that + an unsigned one?
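A minimal sketch of what that could look like, assuming `int` and `unsigned` have the same width with no padding bits. The name `abs_unsigned` is hypothetical, and returning `unsigned` is a design choice made here so that the magnitude of the lowest value has somewhere to go. Unsigned arithmetic wraps modulo 2^N by definition, so none of these operations can overflow, and building the mask from the unsigned value also sidesteps the signed right shift.

```cpp
#include <limits>

// Hypothetical variant of the trick done entirely in unsigned arithmetic.
// Assumes unsigned has the same width as int (true on mainstream platforms).
constexpr unsigned abs_unsigned(int x) {
    // Conversion from int to unsigned is well-defined (modulo 2^N).
    const unsigned ux = static_cast<unsigned>(x);

    // Top bit of ux is 1 exactly when x is negative.
    // 0u - 1u wraps to all ones; 0u - 0u stays zero.
    const unsigned mask = 0u - (ux >> (std::numeric_limits<unsigned>::digits - 1));

    // Same add-then-XOR as before, but wrapping instead of overflowing.
    return (ux + mask) ^ mask;
}
```

For the lowest `int` this returns `max() + 1` as an unsigned value. Converting that result back to `int` would just give the lowest value again, so the caller still has to decide what that input should mean.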

It bothers me how hard you need to think to do this language lawyering. C++ is a system of rules. Computers interpret rules. The language lawyer should be a piece of software. I get it, not everything can be done statically, so, fine, do it dynamically? UBSan comes closest in practice, but doesn't detect everything. I understand formally modeled versions of C and C++ have been developed commercially, but these are not open source so they effectively do not exist. It's a strange situation.

Just the other day I considered writing something branchless and said, "nah", because of uncertainty around UB. How much performance is being left on the table around the world because of similar thinking, driven by the existence of UB?

Maybe I was supposed to write OCaml or Pascal or Rust.

replies(2): >>45408926 #>>45411183 #
2. teo_zero ◴[] No.45408926[source]
Well, I would say that implementation-defined is ok only if you have full control over the entire compilation process. If your code aims at universality, you should find better tricks.

The UB on the add happens in cases where all incarnations of abs() would fail as well, because there simply isn't a correct return value.
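To make the "no correct return value" point concrete, here is a small sketch assuming a two's complement `int` (mandated since C++20). The helper name `negation_fits` is made up for illustration. The range is asymmetric, so the magnitude of the lowest value is one more than the largest representable value.

```cpp
#include <limits>

using int_limits = std::numeric_limits<int>;

// Two's complement has one more negative value than positive values.
static_assert(int_limits::lowest() == -int_limits::max() - 1,
              "this sketch assumes a two's complement int");

// Hypothetical helper: true when -x (and therefore |x|) is representable as int.
constexpr bool negation_fits(int x) {
    return x >= -int_limits::max();
}
```

Only the lowest value fails this check, which is the same single input where the add overflows.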

3. wucke13 ◴[] No.45411183[source]
Rust in particular, with Miri, is quite impressive at catching UB. You just run your test cases via

    cargo miri run
And if your code actually touches UB, Miri will most likely point out exactly where and why.