There’s about a factor of 3 improvement that can be made to most code after the profiler has given up. That probably means better profilers could be written, but in 20 years of using them I’ve only seen two that tried. Sadly, I think flame graphs made profiling more accessible to the unmotivated but didn’t actually improve overall results.
If you see a database query that takes an hour to run and only touches a few GB of data, you should be thinking: "Well, NVMe bandwidth is multiple gigabytes per second, so why can't it run in 1 second or less?"
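As a back-of-envelope check (a sketch with assumed round numbers, not measurements from any real system):

```python
# Back-of-envelope: how long *should* scanning this data take?
# Both inputs are assumed round numbers, not measurements.
data_gb = 4            # assumed: "a few GB" of data touched
nvme_gb_per_s = 3.0    # assumed: sustained sequential read bandwidth

scan_seconds = data_gb / nvme_gb_per_s
print(f"raw scan time: {scan_seconds:.2f} s")   # ~1.33 s

# Compare the observed query time to that hardware floor:
observed_seconds = 3600
print(f"slowdown vs. hardware floor: {observed_seconds / scan_seconds:,.0f}x")  # ~2,700x
```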
The idea that anyone would accept a request to a website taking longer than 30ms (roughly the time it takes a game to render its entire world, including both the CPU and GPU parts, at 60fps) is insane. Nobody should really accept it, but we commonly do.
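That 30ms figure is just the standard 60fps arithmetic; a quick sketch:

```python
# Frame budget at 60 fps: each frame must complete in 1/60 of a second.
frame_ms = 1000 / 60
print(f"one frame: {frame_ms:.1f} ms")          # ~16.7 ms

# CPU simulation and GPU rendering are typically pipelined across
# consecutive frames, so end-to-end latency for one frame's worth of
# work is roughly two frame times.
print(f"CPU + GPU pipeline: {2 * frame_ms:.1f} ms")  # ~33.3 ms
```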
Uber could run the complete global rider/driver flow from a single server.
It doesn't, in part because each of those individual trips earns $1 or more, so it's perfectly acceptable to the business to be more inefficient and use hundreds of servers for this task.
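A rough sanity check on the single-server claim, where the trip volume and per-trip event count are assumed order-of-magnitude figures, not Uber's actual numbers:

```python
# Back-of-envelope: could one server handle global ride matching?
# All inputs below are assumed round numbers for illustration.
trips_per_day = 25_000_000   # assumed order of magnitude
events_per_trip = 100        # assumed: location updates, matching, state changes

events_per_second = trips_per_day * events_per_trip / 86_400
print(f"~{events_per_second:,.0f} events/s")    # ~29,000 events/s

# A single well-tuned modern server can handle on the order of hundreds
# of thousands of small requests per second, so the raw computational
# load alone would fit on one machine with headroom.
```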
Similarly, a small website taking 150ms to render a page only matters if the lost productivity costs more than the engineering time to fix it, and even then, fixing it only makes sense if that engineering time isn't more productively spent adding features or reliability.
The reality is that most of those requests (now) get mixed in with a firehose of traffic, and could be served much faster than 16ms if that were all that was going on. But it’s never all that is going on.