Beautiful branchless binary search

(probablydance.com)

386 points ingve | 1 comments | 28 Apr 23 05:46 UTC

abainbridge ◴[28 Apr 23 09:59 UTC] No.35739367[source]▶

>>35737862 (OP) #

How does the benchmarking work here? I always find this kind of micro-benchmarking hard. I feel like I want to see results with and without a preceding cache flush. And with/without clearing of the branch predictor state. Other things I find hard are: 1) ensuring that the CPU is running at full(ish) speed and isn't in a slower-clocked power saving mode for some of the test, 2) effects of code and data alignment can be significant - I want to measure a bunch of different alignments.

Does gtest (that the author used) help with these things? Does anything?

replies(2): >>35739444 #>>35743578 #

1. krona ◴[28 Apr 23 10:13 UTC] No.35739444[source]▶

Running within a Linux cset shield is fairly standard practice.

For benchmarks reporting times in the nanosecond range, a common approach is a linear regression of total time against varying batch sizes; I'm not sure gtest does this.
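To make the regression idea concrete, here is a minimal sketch in Python (standard library only). It assumes the usual model, total time = fixed overhead + per-call cost × batch size, and fits it by ordinary least squares. The names `fit_line`, `time_batch` and `per_call_ns` are made up for this illustration and are not part of gtest or any benchmark library; the batch sizes are arbitrary.

```python
import time


def fit_line(xs, ys):
    """Ordinary least squares fit of y = intercept + slope * x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope


def time_batch(fn, n):
    """Total wall-clock nanoseconds for n back-to-back calls of fn."""
    start = time.perf_counter_ns()
    for _ in range(n):
        fn()
    return time.perf_counter_ns() - start


def per_call_ns(fn, batch_sizes=(1000, 2000, 4000, 8000, 16000)):
    """Estimate the per-call cost as the slope of total time vs. batch size.

    The intercept absorbs fixed costs such as reading the timer. The slope
    still includes the per-iteration loop overhead of this harness.
    """
    totals = [time_batch(fn, n) for n in batch_sizes]
    intercept, slope = fit_line(list(batch_sizes), totals)
    return slope, intercept
```

Calling `per_call_ns(lambda: sorted([3, 1, 2]))` returns the estimated nanoseconds per call and the fixed overhead. The point of the fit is that timer cost lands in the intercept instead of inflating the per-call number.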

But generally, don't trust any result without a (non-parametric) confidence interval, since confounding factors like OS jitter, CPU frequency, and temperature can't be easily controlled, although some CPU features can be disabled.
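One common non-parametric choice is a percentile bootstrap, sketched below in Python (standard library only). This is an assumption about what such an interval could look like, not necessarily the method meant above; `bootstrap_ci`, its defaults, and the sample timings are hypothetical.

```python
import random
import statistics


def bootstrap_ci(samples, stat=statistics.median, n_resamples=2000,
                 level=0.95, seed=0):
    """Percentile bootstrap confidence interval for stat(samples).

    Resamples the measurements with replacement, recomputes the statistic
    each time, and returns the empirical quantiles of those estimates.
    No distribution shape (e.g. normality) is assumed.
    """
    rng = random.Random(seed)  # fixed seed so the interval is reproducible
    n = len(samples)
    estimates = sorted(
        stat(rng.choices(samples, k=n)) for _ in range(n_resamples)
    )
    alpha = (1.0 - level) / 2.0
    lo_idx = int(alpha * n_resamples)
    hi_idx = min(n_resamples - 1, int((1.0 - alpha) * n_resamples))
    return estimates[lo_idx], estimates[hi_idx]


# Made-up per-run timings in nanoseconds, with one jitter outlier.
timings_ns = [102, 99, 101, 100, 98, 103, 100, 250, 101, 99]
low, high = bootstrap_ci(timings_ns)
```

Using the median as the statistic keeps a single outlier run from dominating the interval, which suits timing data with occasional OS jitter. If two variants' intervals overlap heavily, the measured difference is probably not meaningful.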