
386 points ingve | 3 comments
1. abainbridge No.35739367
How does the benchmarking work here? I always find this kind of micro-benchmarking hard. I'd want to see results with and without a preceding cache flush, and with and without clearing the branch predictor state. Other things I find hard: 1) ensuring the CPU is running at full(ish) speed and isn't in a slower-clocked power-saving mode for part of the test; 2) effects of code and data alignment can be significant, so I want to measure a bunch of different alignments.

Does gtest (which the author used) help with these things? Does anything?

replies(2): >>35739444 >>35743578
2. krona No.35739444
Running within a Linux cset shield (a set of CPUs isolated from the scheduler) is fairly standard practice.

For benchmarks reporting times in the nanosecond range, a common approach is a linear regression over varying batch sizes; I'm not sure gtest does this.

But generally, don't trust any result without a (non-parametric) confidence interval, since confounding factors like OS jitter, CPU frequency scaling, and temperature can't easily be controlled, although some CPU features can be disabled.

3. kccqzy No.35743578
According to Intel, for accurate benchmarking you should write a Linux kernel module, and remember to disable preemption and interrupts while measuring.

https://www.intel.com/content/dam/www/public/us/en/documents...