The world has changed quite a bit. If you have deep pockets and can use AWS etc., it isn't a major problem anymore. However, if they really do run it in their own data centers, that is impressive.
This is not true at all. The hard part isn't cloud vs. on-premise, it's the architecture.
Most sites can either put all their data in a single massive database, or else have an obvious dimension to shard by (e.g. by user ID if users mostly interact with their own data).
But sites where the data is many-to-many and there's a firehose of writes, of which Twitter is a prime example, are a nightmare to scale while remaining performant and reliable. Every single user gets an updated live feed of tweets drawn from every other user -- handling millions of users simultaneously is not easy.
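To make that contrast concrete, here is a minimal sketch of the "easy" case: sharding by user ID so each user's own data lands on a single node. The shard count, hash choice, and function name are illustrative assumptions, not anyone's actual setup.

    import hashlib

    NUM_SHARDS = 64  # illustrative; real systems size this from capacity planning

    def shard_for_user(user_id: str) -> int:
        """Map a user ID to a shard. All of that user's own data lives on one
        node, so a "show me my stuff" query touches exactly one shard."""
        digest = hashlib.md5(user_id.encode()).hexdigest()
        return int(digest, 16) % NUM_SHARDS

    # The many-to-many case breaks this: one tweet from an account with millions
    # of followers has to appear in timelines living on every shard, so either
    # the write fans out everywhere or the read does.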
It is definitely not easy. But this core problem has been discussed since Facebook launched. There are well-known architectures you can follow, and then you fix the remaining bottlenecks with money. Cost is still the most relevant problem, which was my point. The modern cloud raises the threshold at which you have to optimize and gives you a much larger margin for error.
I think Netflix's performance is highly dependent on ISPs' data centers [1].
But yeah, there are still limits where the cloud won't help you.
If your whole infrastructure is designed to serve "historical" content instead of live streams, some bottlenecks cannot be avoided when you want to serve a low-latency sports stream. This came as a surprise to me, but apparently betting still plays a significant role for viewers.
And I don't care how many resources you have available to throw at it, plenty of sites would still fall over with the kind of growth they're having.
This is a trivial approach, which works but is suboptimal (you can cut down on the IO with various optimisations):
Shard by id. Treat it as message queues. Think e-mail, with a lookup from public id -> internal id @ shard.
Then, additionally, every account that gets more than n followers is sharded into "sub-accounts", where posts to the main account are transparently "reposted" to the sub-accounts, just like a simple mailing list reflector.
(The first obvious optimization is to drop propagation of posts from accounts that will hit a large proportion of users, and weave those into timelines at read/generation time instead of writing them to each user; the second is to drop propagation of posts to accounts that have not been accessed for a while, and instead take the expensive generation step of pulling posts to build the timeline the next time they log in. There are many more, but even with the naive approach outlined above this is a solved problem.)
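A minimal sketch of that naive fan-out-on-write plus the first optimization (merging popular accounts' posts in at read time instead of writing them everywhere). The class name, follower threshold, and in-memory dicts are assumptions standing in for real shards and per-user message queues.

    import heapq
    from collections import defaultdict

    CELEBRITY_THRESHOLD = 10_000  # illustrative cutoff for "will hit a large proportion of users"

    class NaiveTimelineStore:
        """Fan-out-on-write for ordinary accounts, merge-on-read for popular ones.
        Plain dicts stand in for per-shard mailboxes / message queues."""

        def __init__(self):
            self.followers = defaultdict(set)         # author -> follower ids
            self.mailboxes = defaultdict(list)        # follower -> [(ts, author, text)] pushed at write time
            self.celebrity_posts = defaultdict(list)  # author -> [(ts, author, text)] kept at the source

        def post(self, author, ts, text):
            fans = self.followers[author]
            if len(fans) >= CELEBRITY_THRESHOLD:
                # Don't fan out; too many mailboxes to touch on every write.
                self.celebrity_posts[author].append((ts, author, text))
            else:
                # Mailing-list-reflector style: copy the post into each follower's mailbox.
                for f in fans:
                    self.mailboxes[f].append((ts, author, text))

        def timeline(self, user, following, limit=50):
            # Pushed posts are already in the mailbox; popular accounts get pulled
            # and woven in at read time.
            pulled = [p for a in following if a in self.celebrity_posts
                        for p in self.celebrity_posts[a]]
            return heapq.nlargest(limit, self.mailboxes[user] + pulled, key=lambda p: p[0])

The second optimization would slot in as another check before the reflector loop: skip mailboxes of accounts that haven't been read recently and rebuild those timelines lazily on next login.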
The Open Connect Wikipedia page [1] currently claims 8,000+ Open Connect Appliances at more than 1,000 ISPs as of 2021, and OCAs in over 52 interchange points.
Netflix is shuffling data at a scale nobody outside maybe a dozen other companies globally needs to deal with, and I doubt any of the social media sites come close.