205 points anurag | 4 comments
shanemhansen ◴[] No.45765342[source]
The unreasonable effectiveness of profiling and digging deep strikes again.
replies(1): >>45776616 #
hinkley ◴[] No.45776616[source]
The biggest tool in the performance toolbox is stubbornness. Without it, all the mechanical sympathy in the world will go unexploited.

There’s about a factor of 3 improvement that can be made to most code after the profiler has given up. That probably means there are better profilers that could be written, but in 20 years of having them I’ve only seen 2 that tried. Sadly I think flame graphs made profiling more accessible to the unmotivated but didn’t actually improve overall results.

replies(4): >>45777180 #>>45777265 #>>45777691 #>>45783146 #
jesse__ ◴[] No.45777691[source]
Broadly agree.

I'm curious, what're the profilers you know of that tried to be better? I have a little homebrew game engine with an integrated profiler that I'm always looking for ideas to make more effective.

replies(1): >>45777903 #
1. hinkley ◴[] No.45777903[source]
Clinic.js tried and lost steam. I have a recollection of a profiler called JProfiler that represented space and time as a graph, but also a recollection they went under. And there is a company selling a product of that name that has been around since that time, but doesn’t quite look how I recalled and so I don’t know if I was mistaken about their demise or I’ve swapped product names in my brain. It was 20 years ago which is a long time for mush to happen.

The common element between attempts is new visualizations. And like drawing a projection of an object in a mechanical engineering drawing, there is no one projection that contains the entire description of the problem. You need to present several and let the brain synthesize the data missing in each individual projection into an accurate model.

replies(1): >>45783152 #
2. never_inline ◴[] No.45783152[source]
what do you think about speedscope's sandwich view?
replies(1): >>45783753 #
3. hinkley ◴[] No.45783753[source]
More of the same. JetBrains has an equivalent, though it seems to be broken at present. The sandwich keeps dragging you back to the flame graph. Call stack depth has value but width is harder for people to judge and it’s the wrong yardstick for many of the concerns I’ve mentioned in the rest of this thread.

The sandwich view hides invocation count, which is one of the biggest things you need to look at for that remaining 3x.
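
As a rough illustration of looking at invocation counts directly, here is a minimal sketch using Python's built-in `cProfile` and `pstats`. The functions `read_config` and `handle_request` are hypothetical stand-ins (nothing from this thread), and reading the raw `stats` table is only one way to get at the numbers.

```python
import cProfile
import pstats


def read_config(key):
    # Hypothetical cheap helper: fast per call, but called very often.
    return {"timeout": 30, "retries": 3}.get(key)


def handle_request():
    # Hypothetical handler that re-reads config inside a loop.
    total = 0
    for _ in range(1000):
        total += read_config("timeout")
    return total


def call_counts(func):
    """Profile func() and return {function name: total call count}."""
    profiler = cProfile.Profile()
    profiler.enable()
    func()
    profiler.disable()
    stats = pstats.Stats(profiler)
    # stats.stats maps (file, line, name) ->
    # (primitive calls, total calls, own time, cumulative time, callers).
    counts = {}
    for (_file, _line, name), (_cc, nc, _tt, _ct, _callers) in stats.stats.items():
        counts[name] = counts.get(name, 0) + nc
    return counts


counts = call_counts(handle_request)
# Rank by how often each function ran, not by how wide it looks in a graph.
for name, n in sorted(counts.items(), key=lambda kv: kv[1], reverse=True)[:5]:
    print(f"{n:>8}  {name}")
```

A helper like `read_config` barely registers by time per call, but sorting by count puts it at the top of the list, which is the signal a width-based view tends to bury.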

Also you need to think about budgets. Which is something game designers do and the rest of us ignore. Do I want 10% of overall processing time to be spent accessing reloadable config? Reporting stats? If the answer is no then we need to look at that, even if data retrieval is currently 40% of overall response time and we are trying to get from 2 seconds to 200 ms.

That means config and stats have a budget of 20ms each and you will never hit 200ms if someone doesn’t look at them. So you can pretend like they don’t exist until you get all the other tent poles chopped and then surprise pikachu face when you’ve already painted them into a corner with your other changes.
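
The budget arithmetic can be sketched in a few lines. The 200 ms target and the 10% shares for config and stats come from the paragraphs above; the measured timings and the 40% share for data retrieval at the target are made-up assumptions for illustration only.

```python
TARGET_MS = 200

# Percent of the target each component may use. The 10% figures are from the
# text above; the data-retrieval share is an assumption.
BUDGET_PERCENT = {"config": 10, "stats": 10, "data_retrieval": 40}

# Hypothetical measurements taken from a 2000 ms response.
MEASURED_MS = {"config": 60.0, "stats": 45.0, "data_retrieval": 800.0}


def over_budget(target_ms, percents, measured_ms):
    """Return {component: (budget_ms, measured_ms)} for components over budget."""
    result = {}
    for name, percent in percents.items():
        budget = target_ms * percent / 100
        if measured_ms[name] > budget:
            result[name] = (budget, measured_ms[name])
    return result


report = over_budget(TARGET_MS, BUDGET_PERCENT, MEASURED_MS)
for name, (budget, actual) in report.items():
    print(f"{name}: budget {budget:.0f} ms, measured {actual:.0f} ms")
```

In this made-up data, config at 60 ms is only 3% of a 2-second response, so a profiler sorted by time would not flag it, yet it is already three times its 20 ms budget at the target.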

When we have a lot of shit that all needs to get done, you want to get to transparency, look at the pile and figure out how to do it all effectively. Combine errands and spread the stressful bits out over time. None of the tools and none of the literature supports this exercise, and in fact most of the literature is actively hostile to this exercise. Which is why you should read a certain level of reproval or even contempt in my writing about optimization. It’s very much intended.

Most advice on writing fast code has not materially changed for a time period where the number of calculations we do has increased by 5 orders of magnitude. In every other domain, we re-evaluate our solutions at each order of magnitude. We have marched past ignorant and into insane at this point. We are broken and we have been broken for twenty years.

replies(1): >>45788640 #
4. never_inline ◴[] No.45788640[source]
I would like to know where I can read more in depth about profiling and performance analysis techniques.