238 points hundredwatt | 2 comments

mosselman ◴[] No.42181990[source]
Hyperfine is great. I sometimes use it for quick web page benchmarks:

https://abuisman.com/posts/developer-tools/quick-page-benchm...

As mentioned elsewhere in the thread, it is not the best approach when you want to chase single-millisecond optimisations, since there is a lot of overhead (especially the way I demonstrate here), but it works very well for sanity checks.
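
For anyone curious, the quick check from the post boils down to something like this (a rough sketch; the URLs are placeholders and the exact flags in the post may differ):

  # Compare two pages by timing a full curl request for each.
  # --warmup does a few untimed runs first; --runs fixes the sample size.
  hyperfine --warmup 3 --runs 30 \
    'curl -s https://example.com/old-page > /dev/null' \
    'curl -s https://example.com/new-page > /dev/null'

Most of what gets measured is process startup plus network latency, which is why this is only good for sanity checks rather than single-millisecond comparisons.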

replies(2): >>42183124 #>>42183449 #
Sesse__ ◴[] No.42183449[source]
> Hyperfine is great.

Is it, though?

What I would expect a system like this to have, at a minimum:

  * Robust statistics with p-values (not just min/max, compensation for multiple hypotheses, no Gaussian assumptions).
  * Multiple stopping points depending on said statistics.
  * Automatic isolation to the greatest extent possible (given appropriate permissions).
  * Interleaved execution, in case something external changes mid-way.
I don't see any of this in hyperfine. It just… runs things N times and then does a naïve average/min/max? At that rate, one could just as well use a shell script and eyeball the results.
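
For comparison, the bare-bones shell-script alternative, with interleaving added, is only a few lines. A sketch, assuming GNU time and two hypothetical binaries ./variant_a and ./variant_b:

  # Alternate the two commands on every iteration so that a change in
  # machine state halfway through affects both samples roughly equally.
  for i in $(seq 1 30); do
    /usr/bin/time -f "a %e" ./variant_a > /dev/null 2>> timings.log
    /usr/bin/time -f "b %e" ./variant_b > /dev/null 2>> timings.log
  done
  # timings.log now holds labelled wall-clock times (in seconds) that can
  # be eyeballed or fed into a proper statistical test.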
replies(3): >>42183978 #>>42185320 #>>42185894 #
bee_rider ◴[] No.42183978[source]
What do you suggest? Those sound like great features.
replies(1): >>42184064 #
1. Sesse__ ◴[] No.42184064[source]
I've only seen such things in internal tools so far, unfortunately, so if you see anything in public, please tell me :-) I'm just confused why everyone thinks hyperfine is so awesome, when it does not meet what I'd consider a fairly low bar for benchmarking tools? (“Best publicly available” != “great”, in my book.)
replies(1): >>42185355 #
2. sharkdp ◴[] No.42185355[source]
> “Best publicly available” != “great”

Of course. But it is free and open source. And everyone is invited to make it better.