Beautiful branchless binary search

(probablydance.com)

386 points ingve | 2 comments | 28 Apr 23 05:46 UTC | HN request time: 0.459s | source

Show context

2102922286 ◴[28 Apr 23 06:24 UTC] No.35738100[source]▶

A cool related algorithm is https://algorithmica.org/en/eytzinger

In addition to being branchless, it also has better cache properties than a standard binary search tree.

If you're doing a binary search in an array, you start in the middle, and then jump all the way to the midway point of one half, and so on. As a result, there's a lot of distance between each read that you do in the array. Thus, each read is putting extra strain on the cache (or the reads won't be cached).

The CPU cache performs much better if the values you want to read are close to each other. Enter the Eytzinger Binary Search. The idea is that, if you're always going to be accessing the root of the tree, and then one of its two children--you should just put those children physically close to the root, so they'll all be on the same cache line!

replies(8): >>35738287 #>>35738451 #>>35738561 #>>35739155 #>>35740049 #>>35740397 #>>35740690 #>>35744101 #

srcreigh ◴[28 Apr 23 09:21 UTC] No.35739155[source]▶

>>35738100 #

Pretty cool stuff. The algorithm returns the largest item that’s at most the size of x. At each step it checks array[k] and then checks either 2k or 2k+1 next.

The last k will be past the end of the array so you have to backtrack to discover the actual lower bound.

k records it’s turns in binary such as 10111. After another right turn it’s 101111. So to backtrack, you strip off the trailing 1s.

How to do that though? There’s an assembly instruction called ffs that will give you the first bit set from the right side. So invert k to get 010000, ffs gives you 4. So for k 101111, shift forward by 4, you get 10. Cancel the right turns, now you have the index of the lower bound.

replies(3): >>35739267 #>>35740458 #>>35740555 #

1. contravariant ◴[28 Apr 23 09:41 UTC] No.35739267[source]▶

>>35739155 #

Don't you just need to remove the trailing 1s? In that case it's just `x & (x+1)`

replies(1): >>35740636 #

2. srcreigh ◴[28 Apr 23 12:47 UTC] No.35740636[source]▶

>>35739267 (TP) #

No, that would just set trailing 1s to 0. We want to trim them.

The idea behind trimming trailing 1s is that they represent a sub tree that is all smaller than the lower bound. By trimming those 1s, we traverse back up the tree to find the smallest number which is larger than or equal to the lower bound (aka the answer).

↑