How We Found 7 TiB of Memory Just Sitting Around

(render.com)

shanemhansen ◴[30 Oct 25 21:05 UTC] No.45765342[source]▶

>>45763359 (OP) #

The unreasonable effectiveness of profiling and digging deep strikes again.

replies(1): >>45776616 #

hinkley ◴[31 Oct 25 20:58 UTC] No.45776616[source]▶

>>45765342 #

The biggest tool in the performance toolbox is stubbornness. Without it all the mechanical sympathy in the world will go unexploited.

There’s about a factor of 3 improvement that can be made to most code after the profiler has given up. That probably means there are better profilers that could be written, but in 20 years of having them I’ve only seen 2 that tried. Sadly I think flame graphs made profiling more accessible to the unmotivated but didn’t actually improve overall results.

replies(4): >>45777180 #>>45777265 #>>45777691 #>>45783146 #

Negitivefrags ◴[31 Oct 25 22:11 UTC] No.45777265[source]▶

>>45776616 #

I think the biggest tool is higher expectations. Most programmers really haven't come to grips with the idea that computers are fast.

If you see a database query that takes 1 hour to run, and only touches a few GB of data, you should be thinking "Well, NVMe bandwidth is multiple gigabytes per second, why can't it run in 1 second or less?"
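As a rough sanity check on that reasoning, here is a minimal sketch of the arithmetic. The function names and the figures in the usage note are illustrative assumptions, not measurements from the article or from any real system.

```python
def scan_seconds(data_gb: float, bandwidth_gb_per_s: float) -> float:
    """Lower bound on the time to read data_gb once at a sustained bandwidth."""
    return data_gb / bandwidth_gb_per_s


def slowdown_factor(observed_seconds: float, data_gb: float,
                    bandwidth_gb_per_s: float) -> float:
    """How many times slower the observed run is than a single sequential read."""
    return observed_seconds / scan_seconds(data_gb, bandwidth_gb_per_s)


if __name__ == "__main__":
    # Hypothetical inputs: a 1 hour query over 4 GB on a drive sustaining 2 GB/s.
    print(scan_seconds(4, 2))
    print(slowdown_factor(3600, 4, 2))
```

With those assumed inputs the read itself takes about 2 seconds, so the hour-long query is roughly 1800 times slower than the raw I/O would require. That gap is the kind of thing worth being stubborn about.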

The idea that anyone would accept a request to a website taking longer than 30ms (the time it takes for a game to render its entire world, including both the CPU and GPU parts, at 60fps) is insane, and nobody should really accept it, but we commonly do.

replies(4): >>45777574 #>>45777649 #>>45777878 #>>45779600 #

javier2 ◴[31 Oct 25 22:50 UTC] No.45777574[source]▶

>>45777265 #

It's also about cost. My gaming computer has 8 cores + 1 expensive GPU + 32GB RAM for me alone. We don't have that per customer.

replies(3): >>45777680 #>>45777764 #>>45778893 #

1. avidiax ◴[31 Oct 25 23:02 UTC] No.45777680[source]▶

>>45777574 #

It's also about revenue.

Uber could run the complete global rider/driver flow from a single server.

It doesn't, in part because all of those individual trips earn $1 or more each, so it's perfectly acceptable to the business to be more inefficient and use hundreds of servers for this task.

Similarly, a small website taking 150ms to render the page only matters if the lost productivity costs more than the engineering time to fix it, and even then, fixing it only makes sense if that engineering time isn't more productively used to add features or reliability.

replies(2): >>45779777 #>>45783407 #

2. onethumb ◴[01 Nov 25 07:09 UTC] No.45779777[source]▶

>>45777680 (TP) #

Uber could not run the complete global rider/driver flow from a single server.

replies(2): >>45780373 #>>45783170 #

3. exe34 ◴[01 Nov 25 09:33 UTC] No.45780373[source]▶

>>45779777 #

I believe the argument was that somebody competent could do it.

replies(1): >>45787003 #

4. avidiax ◴[01 Nov 25 16:50 UTC] No.45783170[source]▶

>>45779777 #

I'm saying you can keep track of all the riders and drivers, matchmake, start/progress/complete trips, with a single server, for the entire world.

Billing, serving assets like map tiles, etc. not included.

Some key things to understand:

* The scale of Uber is not that high. A big city surely has < 10,000 drivers simultaneously, probably less than 1,000.

* The driver and rider phones participate in the state keeping. They send updates every 4 seconds, but they only have to be online to start a trip. Both mobiles cache a trip log that gets uploaded when network is available.

* Since driver/rider send updates every 4 seconds, and since you don't need to be online to continue or end a trip, you don't even need an active spare for the server. A hot spare can rebuild the world state in 4 seconds. State for a rider and driver is just a few bytes each for id, position and status.

* Since you'll have the rider and driver trip logs from their phones, you don't necessarily have to log the ride server side either. It's also OK to lose a little data on the server. You can use UDP.
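The points above can be turned into a back-of-envelope estimate. This is a minimal sketch: the 4 second update interval comes from the comment, while the `Load` class, the participant count, and the record size are hypothetical placeholders rather than Uber figures.

```python
from dataclasses import dataclass

UPDATE_INTERVAL_S = 4  # update period stated in the comment


@dataclass(frozen=True)
class Load:
    participants: int      # concurrent riders + drivers (assumed, not a real figure)
    bytes_per_record: int  # id + position + status (assumed size)

    def state_bytes(self) -> int:
        """Memory needed to hold one record per participant."""
        return self.participants * self.bytes_per_record

    def updates_per_second(self) -> float:
        """Average update rate if every participant reports once per interval."""
        return self.participants / UPDATE_INTERVAL_S

    def ingest_bytes_per_second(self) -> float:
        """Payload bandwidth of those updates, ignoring packet overhead."""
        return self.updates_per_second() * self.bytes_per_record


if __name__ == "__main__":
    load = Load(participants=4_000_000, bytes_per_record=32)
    print(load.state_bytes(), load.updates_per_second(),
          load.ingest_bytes_per_second())
```

With those placeholder inputs the world state is 128 MB and the server sees about a million small updates per second, roughly 32 MB/s of payload. Whether one machine can actually sustain that packet rate is a separate question that this arithmetic does not settle.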

Don't forget that in the olden times, all the taxis in a city like New York were dispatched by humans. All the police in the city were dispatched by humans. You can replace a building of dispatchers with a good server and mobile hardware working together.

replies(1): >>45783447 #

5. hinkley ◴[01 Nov 25 17:17 UTC] No.45783407[source]▶

>>45777680 (TP) #

Practically, you have to parcel out points of contention to a larger and larger team to stop them from spending 30 hours a week just coordinating for changes to the servers. So the servers divide to follow Conway’s Law, or the company goes bankrupt (why not both?).

Microservices try to fix that. But then you need bin packing, so microservices beget Kubernetes.

6. hinkley ◴[01 Nov 25 17:22 UTC] No.45783447{3}[source]▶

>>45783170 #

You could envision a system that used one server per county, and that’s 3k servers. Combine rural counties to get that down to 1000, and that’s probably fewer servers than Uber runs.

What the internet will tell me is that Uber has 4500 distinct services, which is more services than there are counties in the US.

7. lazide ◴[02 Nov 25 01:02 UTC] No.45787003{3}[source]▶

>>45780373 #

The reality is that, no, that is not possible. If a single core can render and return a web page in 16ms, what do you do when you have a million requests/sec?
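To put a number on that question, here is a minimal sketch of the capacity arithmetic. It assumes each request occupies one core for its full service time and that cores are fully utilized; the function name is illustrative.

```python
def cores_needed(requests_per_second: int, cpu_ms_per_request: int) -> int:
    """Cores kept busy at steady state: arrival rate times service time.

    Uses integer milliseconds and ceiling division to avoid float rounding.
    """
    busy_core_ms_per_second = requests_per_second * cpu_ms_per_request
    return (busy_core_ms_per_second + 999) // 1000


if __name__ == "__main__":
    # Figures from the comment: 16 ms per page, one million requests per second.
    print(cores_needed(1_000_000, 16))
```

Under those assumptions a million requests per second at 16ms each keeps about 16,000 cores busy, which is far beyond a single server.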

The reality is most of those requests (now) get mixed in with a firehose of traffic, and could be served much faster than 16ms if that is all that was going on. But it’s never all that is going on.
