←back to thread

1764 points fatihky | 1 comments | | HN request time: 0s | source
Show context
jblow ◴[] No.12701702[source]
#9 is especially stupid because it's so context-dependent. SSE4 gives you a popcount instruction, for example, which would be easily the fastest way to do this, if available.
replies(2): >>12702641 #>>12702679 #
greyman ◴[] No.12702679[source]
Yes, but without that instruction, the algorithm mentioned by recruiter is really the quickest way. I coded chess algorithm in the past and this was exactly the method top chess open-source engines used. But imho it is hard to figure that out without prior experience with this problem.
replies(3): >>12703390 #>>12703834 #>>12708561 #
monocasa ◴[] No.12703390[source]
Yep. Went and tried the lookup method against a 5 step parallel shift and add method (which is the fastest bitwise way I know of without, and the lookup is ~5% faster than the bitwise way.

https://gist.github.com/monocasa/1d44a03cbd0170bfffc6a4a5c37...

replies(1): >>12703823 #
gcp ◴[] No.12703823[source]
Your code has 6 shifts, 6 adds/subs and 6 ANDs.

You can do it with 4 shifts, 3 adds, 1 MUL and 4 ANDs.

Your code is simply suboptimal.

replies(1): >>12704051 #
1. monocasa ◴[] No.12704051[source]
For a 64bit quantity? I'm curious to see your algorithm in actual code.