
120 points misternugget | 5 comments
1. camel-cdr No.42198301
nth_set_bit_u64: wouldn't that be __builtin_ctzll(_pdep_u64(1ULL<<n, v)) with BMI2?
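
To see why depositing a single bit and counting trailing zeros finds the n-th set bit, here is a rough sketch that emulates PDEP in portable C. The names `pdep_u64_soft` and `nth_set_bit_u64_soft` are made up for illustration, and `__builtin_ctzll` assumes GCC or Clang. It is meant to show the idea, not to be fast.

```c
#include <assert.h>
#include <stdint.h>

/* Portable emulation of BMI2 PDEP (hypothetical helper): deposit the low
   bits of `src` into the positions of the set bits of `mask`, lowest first. */
static uint64_t pdep_u64_soft(uint64_t src, uint64_t mask) {
    uint64_t result = 0;
    for (uint64_t bit = 1; mask != 0; bit <<= 1) {
        uint64_t lowest = mask & (~mask + 1); /* isolate lowest set bit of mask */
        if (src & bit) {
            result |= lowest;                 /* source bit lands on that position */
        }
        mask &= mask - 1;                     /* clear the bit just consumed */
    }
    return result;
}

/* Index of the n-th (0-based) set bit of v. Requires n < popcount(v). */
static unsigned nth_set_bit_u64_soft(uint64_t v, unsigned n) {
    assert(n < 64);
    /* Bit n of the source is routed to the n-th set bit of v. */
    uint64_t deposited = pdep_u64_soft(UINT64_C(1) << n, v);
    assert(deposited != 0);                   /* ctz of 0 is undefined */
    return (unsigned)__builtin_ctzll(deposited);
}
```

The shift uses a 64-bit constant because a plain `1 << n` is an `int` shift and is undefined for large n.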
replies(3): >>42198733 #>>42199867 #>>42200581 #
2. SkiFire13 No.42198733
That's assuming you're ok with your program not running on some older CPUs.
replies(1): >>42200177 #
3. kwillets No.42199867
That's my guess as well.

Bitstring rank/select is a well-known problem, and the BMI and non-BMI (Hacker's Delight) versions are available as a reference.
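
For readers who haven't met the terms: rank counts the set bits below a position, and select finds the position of the n-th set bit. Below is a minimal sketch of both for a single 64-bit word, without BMI. It is not the Hacker's Delight version, just a simple loop that takes time proportional to n. The function names are hypothetical, and the builtins assume GCC or Clang.

```c
#include <stdint.h>

/* rank: number of set bits of v strictly below position i (0 <= i <= 64). */
static unsigned rank_u64(uint64_t v, unsigned i) {
    /* Shifting a 64-bit value by 64 is undefined, so handle i >= 64 apart. */
    uint64_t below = (i >= 64) ? v : (v & ((UINT64_C(1) << i) - 1));
    return (unsigned)__builtin_popcountll(below);
}

/* select: position of the n-th (0-based) set bit of v,
   or 64 if v has fewer than n + 1 set bits. */
static unsigned select_u64(uint64_t v, unsigned n) {
    for (unsigned k = 0; k < n && v != 0; k++) {
        v &= v - 1;                           /* drop the lowest set bit */
    }
    return v ? (unsigned)__builtin_ctzll(v) : 64;
}
```

The two are inverses in the sense that `rank_u64(v, select_u64(v, n)) == n` whenever v has more than n set bits.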

4. zamadatix No.42200177
That, and that you're not willing to entertain splitting out the manual version as a #[cfg(not(target_feature = "bmi2"))] fallback implementation. For something already down to ~1 ns, both of those may well be very reasonable assumptions, of course.
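
The same compile-time split can be sketched in C, to match the intrinsics quoted upthread. This assumes GCC or Clang, which define `__BMI2__` when built with `-mbmi2` or a suitable `-march`. The fallback here is a plain clear-lowest-bit loop chosen for brevity, not the manual version from the article.

```c
#include <stdint.h>

#if defined(__BMI2__) && defined(__x86_64__)
#include <immintrin.h>                        /* _pdep_u64 */
#endif

/* Position of the n-th (0-based) set bit of v.
   The caller must guarantee n < popcount(v). */
static unsigned nth_set_bit_u64(uint64_t v, unsigned n) {
#if defined(__BMI2__) && defined(__x86_64__)
    /* Fast path: only compiled when BMI2 is enabled at build time. */
    return (unsigned)__builtin_ctzll(_pdep_u64(UINT64_C(1) << n, v));
#else
    /* Fallback: clear the n lowest set bits, then count trailing zeros. */
    for (unsigned k = 0; k < n; k++) {
        v &= v - 1;
    }
    return (unsigned)__builtin_ctzll(v);
#endif
}
```

Note that this picks one path per build, so a binary built with BMI2 enabled still won't run on CPUs that lack it. Supporting both from one binary would need a runtime check instead.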
5. stouset No.42200581
I believe the equivalent ARM64 instructions are in SVE2, which isn’t yet supported on Apple’s M-series chips as of M4, sadly.