(hueffner.de)

434 points gnabgib | 1 comments | 15 May 25 21:57 UTC

captaincrunch ◴[15 May 25 22:46 UTC] No.44000123[source]▶

This is fast, READABLE, and accurate:

bool is_leap_year(uint32_t y) {
    // Works for Gregorian years in range [0, 65535]
    return ((!(y & 3)) && ((y % 25 != 0) || !(y & 15)));
}

replies(3): >>44000304 #>>44000522 #>>44003153 #

andrepd ◴[15 May 25 23:12 UTC] No.44000304[source]▶

>>44000123 #

This implementation is mentioned in TFA. It's much slower and includes branches.

replies(1): >>44000495 #

hoten ◴[15 May 25 23:52 UTC] No.44000495[source]▶

>>44000304 #

I'd expect even without optimizations on, there wouldn't be branches in the output for that code.

replies(1): >>44000544 #

kragen ◴[16 May 25 00:01 UTC] No.44000544[source]▶

>>44000495 #

There are, even with optimizations on. You could have checked: https://godbolt.org/#g:!((g:!((g:!((h:codeEditor,i:(filename...

I didn't find any way to get a compiler to generate a branchless version. I tried clang and GCC, both for amd64, with -O0, -O5, -Os, and for clang, -Oz.

replies(1): >>44000615 #

mmozeiko ◴[16 May 25 00:14 UTC] No.44000615[source]▶

>>44000544 #

If you change the logical and/or (&&, ||) to bitwise and/or (&, |), then it'll be branchless.
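A minimal sketch of that change, assuming the function from upthread. The name is_leap_year_bitwise and the check_same_results helper are made up for illustration. Every operand here is already 0 or 1 and has no side effects, so swapping the operators does not change the result. Whether a particular compiler actually emits branch-free code for it is worth confirming on godbolt rather than taking on faith.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

// Short-circuiting version from upthread.
bool is_leap_year(uint32_t y) {
    return ((!(y & 3)) && ((y % 25 != 0) || !(y & 15)));
}

// Same logic with bitwise & and | instead of && and ||.
// (Hypothetical name.) All three subexpressions are always evaluated,
// so the source no longer asks for any short-circuit control flow.
bool is_leap_year_bitwise(uint32_t y) {
    return (!(y & 3)) & ((y % 25 != 0) | !(y & 15));
}

// Hypothetical helper: exhaustively compare both versions over the
// range the original comment claims, [0, 65535].
void check_same_results(void) {
    for (uint32_t y = 0; y <= 65535; y++) {
        assert(is_leap_year(y) == is_leap_year_bitwise(y));
    }
}
```

Calling check_same_results() should pass silently if the two versions agree over that range.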

replies(1): >>44014827 #

kragen ◴[17 May 25 15:01 UTC] No.44014827[source]▶

>>44000615 #

True: https://godbolt.org/#g:!((g:!((g:!((h:codeEditor,i:(filename... but I understood hoten to be saying that compilers would generally produce that version from the short-circuiting version, and they don't.

replies(1): >>44017765 #

hoten ◴[17 May 25 23:29 UTC] No.44017765[source]▶

>>44014827 #

Yeah I was wrong.

Do we know why the compiler doesn't do it? Surely the output is the same and avoiding branches is clearly faster.

Maybe short circuiting requires such an optimization not be made?

replies(1): >>44028496 #

kragen ◴[19 May 25 11:09 UTC] No.44028496[source]▶

>>44017765 #

There are cases where the optimization wouldn't be safe (like i < n && a[i] != k, where evaluating a[i] unconditionally could read out of bounds), but this is not one of them. Maybe the compiler is just dumb. Or maybe avoiding branches is not clearly faster in cases like this? Have you measured this particular case?
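To make the unsafe case concrete, here is a small sketch. The function name differs_at and its parameters are invented for illustration. The short-circuit is what guarantees a[i] is only read when i is in range, so a compiler can't blindly turn the && into an unconditional evaluation of both sides here, unlike in the leap year check, where no operand can fault.

```c
#include <stdbool.h>
#include <stddef.h>

// Hypothetical example: true if index i is in range and a[i] differs from k.
// The && matters: when i >= n, the right-hand side is never evaluated,
// so a[i] is never read.
bool differs_at(const int *a, size_t n, size_t i, int k) {
    return i < n && a[i] != k;
}

// Writing this as (i < n) & (a[i] != k) would read a[i] even when
// i >= n, which is an out-of-bounds access (undefined behavior).
```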

A leap year check in three instructions