There’s about a factor of 3 improvement that can be made to most code after the profiler has given up. That probably means there are better profilers than could be written, but in 20 years of having them I’ve only seen 2 that tried. Sadly I think flame graphs made profiling more accessible to the unmotivated but didn’t actually improve overall results.
If you see a database query that takes 1 hour to run, and only touches a few gb of data, you should be thinking "Well nvme bandwidth is multiple gigabytes per second, why can't it run in 1 second or less?"
The idea that anyone would accept a request to a website taking longer than 30ms, (the time it takes for a game to render it's entire world including both the CPU and GPU parts at 60fps) is insane, and nobody should really accept it, but we commonly do.
Uber could run the complete global rider/driver flow from a single server.
It doesn't, in part because all of those individual trips earn $1 or more each, so it's perfectly acceptable to the business to be more more inefficient and use hundreds of servers for this task.
Similarly, a small website taking 150ms to render the page only matters if the lost productivity costs less than the engineering time to fix it, and even then, only makes sense if that engineering time isn't more productively used to add features or reliability.
Billing, serving assets like map tiles, etc. not included.
Some key things to understand:
* The scale of Uber is not that high. A big city surely has < 10,000 drivers simultaneously, probably less than 1,000.
* The driver and rider phones participate in the state keeping. They send updates every 4 seconds, but they only have to be online to start a trip. Both mobiles cache a trip log that gets uploaded when network is available.
* Since driver/rider send updates every 4 seconds, and since you don't need to be online to continue or end a trip, you don't even need an active spare for the server. A hot spare can rebuild the world state in 4 seconds. State for a rider and driver is just a few bytes each for id, position and status.
* Since you'll have the rider and driver trip logs from their phones, you don't necessarily have to log the ride server side either. Its also OK to lose a little data on the server. You can use UDP.
Don't forget that in the olden times, all the taxis in a city like New York were dispatched by humans. All the police in the city were dispatched by humans. You can replace a building of dispatchers with a good server and mobile hardware working together.
Microservices try to fix that. But then you need bin packing so microservices beget kubernetes.
What the internet will tell me is that uber has 4500 distinct services, which is more services than there are counties in the US.
The reality is most of those requests (now) get mixed in with a firehose of traffic, and could be served much faster than 16ms if that is all that was going on. But it’s never all that is going on.