A leap year check in three instructions

(hueffner.de)


qingcharles ◴[15 May 25 22:22 UTC] No.43999955[source]▶

I love these incomprehensible magic number optimizations. Every time I see one, I wonder how many optimizations like this we missed back in the old days when we were writing all our inner loops in assembly.

Does anyone have a collection of these things?

replies(4): >>43999992 #>>44000134 #>>44000173 #>>44000633 #

ryao ◴[16 May 25 00:18 UTC] No.44000633[source]▶

>>43999955 #

Here is a short list:

https://graphics.stanford.edu/~seander/bithacks.html

It is not on the list, but #define CMP(X, Y) (((X) > (Y)) - ((X) < (Y))) is an efficient way to do generic comparisons for things that want UNIX-style comparators. If you compare the output against 0 to check for some form of greater than, less than or equality, the compiler should automatically simplify it. For example, CMP(X, Y) > 0 is simplified to (X > Y) by a compiler.
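As a minimal sketch of how the macro might be used with qsort(), assuming plain int elements (the helper names compare_ints and is_greater are made up for illustration):

```c
#include <stdlib.h>

/* Three-way comparison: evaluates to -1, 0, or 1. */
#define CMP(X, Y) (((X) > (Y)) - ((X) < (Y)))

/* UNIX-style comparator for qsort() over int (hypothetical helper). */
static int compare_ints(const void *a, const void *b)
{
    int x = *(const int *)a;
    int y = *(const int *)b;
    /* Unlike `return x - y;`, this cannot overflow for extreme values. */
    return CMP(x, y);
}

/* An optimizing compiler is expected to reduce this to a plain x > y. */
static int is_greater(int x, int y)
{
    return CMP(x, y) > 0;
}
```

Calling qsort(v, n, sizeof v[0], compare_ints) on an int array v of length n should then sort it in ascending order, including arrays that contain INT_MIN and INT_MAX.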

The signum(x) function that is equivalent to CMP(X, 0) can be done in 3 or 4 instructions depending on your architecture without any comparison operations:

https://www.cs.cornell.edu/courses/cs6120/2022sp/blog/supero...

It is such a famous example that compilers probably optimize CMP(X, 0) to that, but I have not checked. Coincidentally, the expansion of CMP(X, 0) is on the bit hacks list.
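For illustration, here is one way to write a comparison-free signum in portable C, assuming 32-bit integers. This is only a sketch of the general idea using shifts and negation; it is not the exact instruction sequence from the linked superoptimizer write-up, and the name signum32 is made up:

```c
#include <stdint.h>

/* Returns -1, 0, or 1 without using any comparison operators. */
static int32_t signum32(int32_t x)
{
    uint32_t ux = (uint32_t)x;

    /* Sign bit of x, negated: -1 if x < 0, otherwise 0. */
    int32_t neg = -(int32_t)(ux >> 31);

    /* Sign bit of -x (computed in unsigned arithmetic to avoid overflow):
     * 1 if x > 0, and also 1 for INT32_MIN, where neg is already -1. */
    int32_t pos = (int32_t)((uint32_t)(0u - ux) >> 31);

    /* -1 | anything is -1, so negative inputs always yield -1. */
    return neg | pos;
}
```

The unsigned casts sidestep the implementation-defined behavior of right-shifting negative signed values.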

There are a few more superoptimized mathematical operations listed here:

https://www2.cs.arizona.edu/~collberg/Teaching/553/2011/Reso...

Note that the assembly code appears to be for the Motorola 68000 processor and it makes use of flags that are set in edge cases to work.

Finally, there is a list of helpful macros for bit operations that originated in OpenSolaris (as far as I know) here:

https://github.com/freebsd/freebsd-src/blob/master/sys/cddl/...

There used to be an OpenSolaris blog post on them, but Oracle has taken it down.

Enjoy!

replies(3): >>44001468 #>>44002802 #>>44002875 #

1. eru ◴[16 May 25 07:49 UTC] No.44002802[source]▶

>>44000633 #

> It is not on the list, but #define CMP(X, Y) (((X) > (Y)) - ((X) < (Y))) is an efficient way to do generic comparisons for things that want UNIX-style comparators. If you compare the output against 0 to check for some form of greater than, less than or equality, the compiler should automatically simplify it. For example, CMP(X, Y) > 0 is simplified to (X > Y) by a compiler.

I guess this only applies when the compiler knows what version of > you are using?

E.g., it might not work in C++ when < and > are overloaded for, say, strings?

replies(3): >>44003877 #>>44008285 #>>44008439 #

2. trollbridge ◴[16 May 25 11:00 UTC] No.44003877[source]▶

>>44002802 (TP) #

The compiler would resolve that before the optimiser.

replies(1): >>44011881 #

3. ◴[16 May 25 18:08 UTC] No.44008285[source]▶

>>44002802 (TP) #

4. ryao ◴[16 May 25 18:25 UTC] No.44008439[source]▶

>>44002802 (TP) #

My comment had been meant for C, but it should apply to C++ too even when operator overloading is used, provided the comparisons are simple and inlined. If you add overloads for the > and < operators in your string example to a place where they would inline, and the overload compares .length(), this should simplify. For example, godbolt shows that CMP(X, Y) == 0 is optimized to one mov instruction and one cmp instruction despite operator overloads when I implement your string example:

https://godbolt.org/z/nGbPhz86q

If you did not inline the operator overloads and had them in another compilation unit, do not expect this to simplify (unless you use LTO).

If you have compound comparators in the operator overloads (such that on equality in one field, it considers a second for a tie breaker), I would not expect it to simplify, although the compiler could surprise me.
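To make "compound comparator" concrete, here is a rough C analogue of the tie-breaker shape (not the C++ operator overloads themselves). The struct record type and its fields are hypothetical:

```c
/* Three-way comparison: evaluates to -1, 0, or 1. */
#define CMP(X, Y) (((X) > (Y)) - ((X) < (Y)))

/* Hypothetical two-field key: group is primary, id breaks ties. */
struct record {
    int group;
    int id;
};

/* Compare by group first; only on equality fall back to id. */
static int compare_records(const struct record *a, const struct record *b)
{
    int r = CMP(a->group, b->group);
    return r != 0 ? r : CMP(a->id, b->id);
}
```

The data-dependent fallback to the second field is the part I would not count on a compiler folding away when the result is then compared against 0.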

replies(1): >>44011879 #

5. eru ◴[17 May 25 03:42 UTC] No.44011879[source]▶

>>44008439 #

I was thinking more of, e.g., lexicographic comparisons of strings, not just comparing by length.

Yes, if you have a smart enough compiler, or a simple enough comparison, this will simplify.

replies(1): >>44023696 #

6. eru ◴[17 May 25 03:43 UTC] No.44011881[source]▶

>>44003877 #

I'd like to see that resolved for, e.g., lexicographic comparison of strings.

7. ryao ◴[18 May 25 19:18 UTC] No.44023696{3}[source]▶

>>44011879 #

You could use CMP(A, B) as part of your lexicographic comparison: compare the strings character by character and return the first non-zero result (or zero if both strings turn out to be equal).
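A minimal sketch of that, assuming NUL-terminated C strings compared as unsigned bytes (lex_compare is a made-up name, and this is roughly what strcmp() already does, normalized to -1/0/1):

```c
/* Three-way comparison: evaluates to -1, 0, or 1. */
#define CMP(X, Y) (((X) > (Y)) - ((X) < (Y)))

/* Lexicographic three-way comparison of two NUL-terminated strings. */
static int lex_compare(const char *a, const char *b)
{
    for (;;) {
        unsigned char ca = (unsigned char)*a++;
        unsigned char cb = (unsigned char)*b++;
        int r = CMP(ca, cb);

        /* Stop at the first differing character, or when both
         * strings end at the same position (r is 0 in that case). */
        if (r != 0 || ca == '\0')
            return r;
    }
}
```

A shorter string that is a prefix of a longer one compares as less, because its terminating NUL is smaller than any other byte.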

If you implement the operators, you can use CMP(A, B) to turn them into a three-valued output, since it works solely using Boolean logic, but I would be surprised if it simplified. I am half prepared to be surprised, since the compiler might do some CSE after inlining and then apply some other transformation. That said, you really only want to use CMP(A, B) for numerical comparisons.

replies(1): >>44026955 #

8. eru ◴[19 May 25 06:22 UTC] No.44026955{4}[source]▶

>>44023696 #

Yes, you can definitely manually define it. I was talking about what we can reasonably expect the compiler to figure out on its own.
