
103 points vortex_ape | 2 comments
xeeeeeeeeeeenu ◴[] No.42742731[source]
> So on x86_64 processors, we have to branch to say “a 32-bit zero value has 32 leading zeros”.

Not if you're targeting x86-64-v3 or higher. Haswell (Intel) and Piledriver (AMD) introduced the LZCNT instruction that doesn't have this problem.

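To make the zero case concrete, here is a minimal sketch, assuming a GCC or Clang toolchain. The helper name `clz32` is made up for illustration. `__builtin_clz` is undefined for a zero argument, so the source spells out the zero case; when LZCNT is enabled (for example with `-mlzcnt` or `-march=x86-64-v3`) these compilers can usually fold the whole function into a single instruction, though the exact output depends on the compiler and flags.

```c
#include <stdint.h>

/* Hypothetical helper: leading-zero count of a 32-bit value,
 * defined for zero as well (returns 32).
 * __builtin_clz is a GCC/Clang builtin whose result is undefined
 * for an argument of 0, so that case is handled explicitly. */
static inline unsigned clz32(uint32_t x)
{
    return x ? (unsigned)__builtin_clz(x) : 32u;
}
```

Without LZCNT the same source still works; the compiler just has to keep some form of zero handling around the BSR-based sequence.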
replies(2): >>42742835 #>>42742859 #
pklausler ◴[] No.42742859[source]
Easy to count leading zeroes in a branch-free manner without a hardware instruction using a conditional move and a de Bruijn sequence; see https://github.com/llvm/llvm-project/blob/main/flang/include... .
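
A rough sketch of the general idea for 32-bit values, not the linked flang code: smear the highest set bit downward, isolate it, and use a de Bruijn multiply to index a small table. The function and table names are made up for illustration, and whether the zero adjustment becomes a conditional move or a `setcc` is up to the compiler.

```c
#include <stdint.h>

/* Maps ((1u << k) * 0x077CB531u) >> 27 back to the bit index k.
 * 0x077CB531 is a 32-bit de Bruijn sequence, so the 32 possible
 * top-5-bit windows are all distinct. */
static const uint8_t debruijn_index[32] = {
     0,  1, 28,  2, 29, 14, 24,  3, 30, 22, 20, 15, 25, 17,  4,  8,
    31, 27, 13, 23, 21, 19, 16,  7, 26, 12, 18,  6, 11,  5, 10,  9
};

/* Hypothetical helper: leading-zero count without a hardware
 * instruction and without a branch in the source. Returns 32 for 0. */
static inline unsigned clz32_debruijn(uint32_t x)
{
    uint32_t v = x;

    /* Smear the highest set bit into every lower position. */
    v |= v >> 1;
    v |= v >> 2;
    v |= v >> 4;
    v |= v >> 8;
    v |= v >> 16;

    /* Keep only the highest set bit (stays 0 when x is 0). */
    uint32_t top = v - (v >> 1);

    /* The de Bruijn multiply puts a unique 5-bit pattern in the top bits. */
    unsigned msb = debruijn_index[(uint32_t)(top * 0x077CB531u) >> 27];

    /* For x == 0 the lookup yields 0, so add 1 to reach 32. */
    return 31u - msb + (unsigned)(x == 0);
}
```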
replies(1): >>42744263 #
1. hinkley ◴[] No.42744263[source]

    x |= x >> 1;
    x |= x >> 2;
    x |= x >> 4;
    x |= x >> 8;
    x |= x >> 16;
    x |= x >> 32;

Isn't there another way to do this without so many data races?

I feel like this should be

    x |= x >> 1 | x >> ??? ...
replies(1): >>42744484 #
2. gpderetta ◴[] No.42744484[source]
By data races I assume you actually mean data dependencies?