Upgrading Uber's MySQL Fleet

1. sandGorgon ◴[14 Oct 24 14:18 UTC] No.41837780[source]▶

So how does an architecture like "2100 clusters" work? Do the write APIs go to whichever database contains their data?

How is this done? A user would have history, payments, etc. Are all of them colocated in one cluster (which would mean the sharding is based on user ID)?

Is there then a database router service that routes each DB query to the correct database?


2. bob1029 ◴[14 Oct 24 14:35 UTC] No.41837941[source]▶

>>41837780 (TP) #

I imagine it works just like any multi-tenant SaaS product wherein you have a database per customer (region/city) with a unified web portal. The primary difference being that this is B2C and the ratio of customers per database is much greater than 1.

3. ericbarrett ◴[14 Oct 24 16:07 UTC] No.41838866[source]▶

>>41837780 (TP) #

A query for a given item goes to a router*, as you said, that directs it to a given shard which holds the data. I don't know Uber's schema, but usually the data is "denormalized" and you are not doing too many JOINs etc. Probably a caching layer in front as well.
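A minimal sketch of that routing step, not Uber's actual design: it assumes sharding by user ID with a stable hash and a fixed shard count. The `SHARD_DSNS` list, its hostnames, and the function names are all hypothetical.

```python
import hashlib

# Hypothetical shard map: list index is the shard number, value is where
# that shard lives. The hostnames are made up for illustration.
SHARD_DSNS = [
    "mysql://shard0.example.internal/app",
    "mysql://shard1.example.internal/app",
    "mysql://shard2.example.internal/app",
    "mysql://shard3.example.internal/app",
]


def shard_for(user_id: int, num_shards: int = len(SHARD_DSNS)) -> int:
    """Map a user ID to a shard number with a stable hash.

    hashlib is used instead of the built-in hash() so the mapping is the
    same in every process and across restarts.
    """
    digest = hashlib.sha256(str(user_id).encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % num_shards


def route(user_id: int) -> str:
    """Return the connection string of the shard holding this user's rows."""
    return SHARD_DSNS[shard_for(user_id)]
```

If every per-user table (history, payments, etc.) is keyed this way, all of one user's rows land on the same shard, so no cross-shard JOIN is needed for that user. The catch with plain hash-modulo is that changing the shard count remaps most keys, which is why a lookup directory or consistent hashing is a common alternative.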

If you think this sounds more like a job for a K/V store than a relational database, well, you'd be right; this is why e.g. Facebook moved to MyRocks. But MySQL/InnoDB does a decent job and gives you features like write guarantees, transactions, and solid replication, with low write latency and no Raft or similar nondeterministic/geographically limited protocols.

* You can also structure your data so that the shard is encoded in the lookup key, so the "routing" is handled locally. Depends on your setup.
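One way the shard-in-the-key idea could look, as a sketch under assumptions: a 64-bit integer key whose top bits hold the shard number and whose low bits hold a per-shard ID. The 12/52 bit split and all names here are hypothetical, not taken from any particular system.

```python
SHARD_BITS = 12  # assumption: room for up to 4096 shards
LOCAL_BITS = 52  # assumption: remaining bits hold the per-shard row ID


def make_key(shard_id: int, local_id: int) -> int:
    """Pack a shard number and a per-shard ID into a single integer key."""
    if not 0 <= shard_id < (1 << SHARD_BITS):
        raise ValueError("shard_id out of range")
    if not 0 <= local_id < (1 << LOCAL_BITS):
        raise ValueError("local_id out of range")
    return (shard_id << LOCAL_BITS) | local_id


def shard_of(key: int) -> int:
    """Recover the shard number from the key alone, with no router lookup."""
    return key >> LOCAL_BITS


def local_of(key: int) -> int:
    """Recover the per-shard ID from the key."""
    return key & ((1 << LOCAL_BITS) - 1)


key = make_key(37, 123456)
print(shard_of(key), local_of(key))  # 37 123456
```

Any client holding the key can compute the shard itself. The tradeoff is that moving a row to another shard means changing its key, or keeping a small remap table for rows that have moved.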