
205 points | anurag | 2 comments
shanemhansen ◴[] No.45765342[source]
The unreasonable effectiveness of profiling and digging deep strikes again.
replies(1): >>45776616 #
hinkley ◴[] No.45776616[source]
The biggest tool in the performance toolbox is stubbornness. Without it all the mechanical sympathy in the world will go unexploited.

There’s about a factor of 3 improvement that can be made to most code after the profiler has given up. That probably means there are better profilers that could be written, but in 20 years of having them I’ve only seen 2 that tried. Sadly I think flame graphs made profiling more accessible to the unmotivated but didn’t actually improve overall results.

replies(4): >>45777180 #>>45777265 #>>45777691 #>>45783146 #
Negitivefrags ◴[] No.45777265[source]
I think the biggest tool is higher expectations. Most programmers really haven't come to grips with the idea that computers are fast.

If you see a database query that takes 1 hour to run and only touches a few GB of data, you should be thinking "Well, NVMe bandwidth is multiple gigabytes per second, why can't it run in 1 second or less?"

The idea that anyone would accept a request to a website taking longer than 30 ms (the time it takes for a game to render its entire world, including both the CPU and GPU parts, at 60fps) is insane, and nobody should really accept it, but we commonly do.

replies(4): >>45777574 #>>45777649 #>>45777878 #>>45779600 #
1. mjevans ◴[] No.45779600[source]
30 ms for a website is a tough bar to clear considering the speed of light (or rather, of signals in copper and light in fiber).

https://en.wikipedia.org/wiki/Speed_of_light

Just as an example, the round-trip delay from where I rent to the local backbone is about 14 ms alone, and the average to a webserver is 53 ms, just for a simple echo reply. (I picked it because I'd hoped it was in Redmond or some nearby datacenter, but it looks more likely to be in a cheaper labor area.)
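
As a rough sketch of the physical floor being described here: light in fiber travels at about c divided by the fiber's refractive index. The index of 1.47 is an assumed typical value, and real paths add routing detours, queuing, and equipment delay on top of this bound.

```python
C_KM_PER_S = 299_792.458   # speed of light in vacuum
FIBER_INDEX = 1.47         # assumed typical refractive index of optical fiber


def min_rtt_ms(distance_km: float, index: float = FIBER_INDEX) -> float:
    """Best-case round-trip time over a straight fiber run of distance_km."""
    speed_km_per_s = C_KM_PER_S / index
    return 2 * distance_km / speed_km_per_s * 1000.0


def max_distance_km(rtt_ms: float, index: float = FIBER_INDEX) -> float:
    """Farthest a server can be if the whole RTT budget is spent on propagation."""
    speed_km_per_s = C_KM_PER_S / index
    return (rtt_ms / 1000.0) * speed_km_per_s / 2


print(f"1000 km of fiber: at least {min_rtt_ms(1000):.1f} ms round trip")
print(f"30 ms budget: server within about {max_distance_km(30):.0f} km")
```

Under these assumptions a 30 ms budget spent entirely on propagation reaches a bit over 3000 km one way, before any server time is counted.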

However, it's only the bloated ECMAScript (JavaScript) trash web of today that makes a website take longer than ~1 second to load on a modern PC. Plain old HTML, images on a reasonable diet, and some script elements only for interactive things can scream.

    mtr -bzw microsoft.com
    6. AS7922        be-36131-cs03.seattle.wa.ibone.comcast.net (2001:558:3:942::1)         0.0%    10   12.9  13.9  11.5  18.7   2.6
    7. AS7922        be-2311-pe11.seattle.wa.ibone.comcast.net (2001:558:3:3a::2)           0.0%    10   11.8  13.3  10.6  17.2   2.4
    8. AS7922        2001:559:0:80::101e                                                    0.0%    10   15.2  20.7  10.7  60.0  17.3
    9. AS8075        ae25-0.icr02.mwh01.ntwk.msn.net (2a01:111:2000:2:8000::b9a)            0.0%    10   41.1  23.7  14.8  41.9  10.4
    10. AS8075        be140.ibr03.mwh01.ntwk.msn.net (2603:1060:0:12::f18e)                  0.0%    10   53.1  53.1  50.2  57.4   2.1
    11. AS8075        2603:1060:0:10::f536                                                   0.0%    10   82.1  55.7  50.5  82.1   9.7
    12. AS8075        2603:1060:0:10::f3b1                                                   0.0%    10   54.4  96.6  50.4 147.4  32.5
    13. AS8075        2603:1060:0:10::f51a                                                   0.0%    10   49.7  55.3  49.7  78.4   8.3
    14. AS8075        2a01:111:201:f200::d9d                                                 0.0%    10   52.7  53.2  50.2  58.1   2.7
    15. AS8075        2a01:111:2000:6::4a51                                                  0.0%    10   49.4  51.6  49.4  54.1   1.7
    20. AS8075        2603:1030:b:3::152                                                     0.0%    10   50.7  53.4  49.2  60.7   4.2
replies(1): >>45783863 #
2. hinkley ◴[] No.45783863[source]
In the cloud era this gets a bit better, but at my last job I removed a single service that was adding 30 ms to response time and replaced it with a Consul lookup with a watch on it. It wasn’t even a big service. Same DC, very simple graph query with a very small response. You can burn through 30 ms without half trying.
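
A minimal sketch of the general pattern, assuming the watch delivers new values through a callback: the value is kept in process memory and refreshed on change, so the request path does a local read instead of a network call. The `WatchedValue` class and the `backend` key are hypothetical names for illustration; this does not use or reproduce the Consul client API.

```python
import threading


class WatchedValue:
    """In-process copy of a remote value, refreshed by a watch callback."""

    def __init__(self, initial):
        self._lock = threading.Lock()
        self._value = initial

    def on_change(self, new_value):
        # Called by whatever watch mechanism is in use when the value changes.
        with self._lock:
            self._value = new_value

    def get(self):
        # Request path: a local read, no network round trip.
        with self._lock:
            return self._value


# Hypothetical usage: the watcher pushes an update, requests read locally.
routing = WatchedValue({"backend": "10.0.0.1"})
routing.on_change({"backend": "10.0.0.2"})
print(routing.get())
```

The trade-off is that reads may briefly see a stale value between a change and the watch firing, which is often acceptable for small, slowly changing lookups.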