
234 points benocodes | 3 comments
1. sandGorgon ◴[] No.41837780[source]
So how does an architecture like "2100 clusters" work? Do the write APIs go to whichever database holds the relevant data?

How is this done? A user would have history, payments, etc. Are all of them colocated in one cluster (which would mean the sharding is based on user ID)?

Is there then a database router service that routes each DB query to the correct database?

replies(2): >>41837941 #>>41838866 #
2. bob1029 ◴[] No.41837941[source]
I imagine it works just like any multi-tenant SaaS product wherein you have a database per customer (region/city) with a unified web portal. The primary difference being that this is B2C and the ratio of customers per database is much greater than 1.
3. ericbarrett ◴[] No.41838866[source]
A query for a given item goes to a router*, as you said, that directs it to a given shard which holds the data. I don't know Uber's schema, but usually the data is "denormalized" and you are not doing too many JOINs etc. Probably a caching layer in front as well.
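To make the router idea concrete, here is a minimal sketch in Python. It assumes sharding by user ID with a plain hash-modulo mapping; I don't know what Uber actually does, and the names `shard_for`, `dsn_for`, and the `shard-NNNN.db.internal` hostnames are made up for illustration. The 2100 figure is just the cluster count mentioned upthread.

```python
import hashlib

# Assumption: one shard per cluster, 2100 clusters as quoted in the thread.
NUM_SHARDS = 2100


def shard_for(user_id: str, num_shards: int = NUM_SHARDS) -> int:
    """Map a user ID to a shard number with a stable hash.

    hashlib is used instead of the built-in hash() because hash() is
    randomized per process for strings, so it would not route consistently.
    """
    digest = hashlib.sha256(user_id.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % num_shards


def dsn_for(user_id: str) -> str:
    """Return a hypothetical connection string for the user's shard."""
    return f"mysql://shard-{shard_for(user_id):04d}.db.internal/app"


if __name__ == "__main__":
    # Every query for the same user lands on the same shard, so that
    # user's history, payments, etc. can live together.
    print(dsn_for("user-12345"))
```

Plain modulo hashing forces a lot of data movement if the shard count changes, so a real router would more likely consult a directory table or use consistent hashing. The sketch only shows the lookup step.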

If you think this sounds more like a job for a K/V store than a relational database, well, you'd be right; this is why e.g. Facebook moved to MyRocks. But MySQL/InnoDB does a decent job and gives you features like write guarantees, transactions, and solid replication, with low write latency and no Raft or similar nondeterministic/geographically limited protocols.

* You can also structure your data so that the shard is encoded in the lookup key, so the "routing" is handled locally. It depends on your setup.
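A rough sketch of the shard-in-the-key approach, in Python. The 16/48 bit split and the function names are assumptions picked for illustration, not anything from a real schema. The point is that any client holding a key can find the shard without asking a router.

```python
# Assumed layout of a 64-bit key: high 16 bits = shard, low 48 bits = local ID.
SHARD_BITS = 16
LOCAL_BITS = 48
LOCAL_MASK = (1 << LOCAL_BITS) - 1


def make_key(shard_id: int, local_id: int) -> int:
    """Pack a shard number and a per-shard ID into one integer key."""
    if not 0 <= shard_id < (1 << SHARD_BITS):
        raise ValueError("shard_id out of range")
    if not 0 <= local_id <= LOCAL_MASK:
        raise ValueError("local_id out of range")
    return (shard_id << LOCAL_BITS) | local_id


def shard_of(key: int) -> int:
    """Recover the shard number from a key; no lookup service needed."""
    return key >> LOCAL_BITS


def local_of(key: int) -> int:
    """Recover the per-shard ID from a key."""
    return key & LOCAL_MASK


if __name__ == "__main__":
    key = make_key(shard_id=42, local_id=1001)
    print(shard_of(key), local_of(key))
```

The tradeoff is that the shard is baked into the key when the row is created, so moving a row to another shard means either changing its key or adding a mapping from the encoded shard to a physical one.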