
768 points cyndunlop | 1 comments
pornel ◴[] No.43108545[source]
I wonder why timelines aren't implemented as a hybrid gather-scatter that picks its strategy based on account popularity (a combination of fan-out to followers and a lazy fetch of popular followed accounts when a follower's timeline is served).

When you have a celebrity account, instead of fanning out every message to millions of followers' timelines, it would be cheaper to do nothing when the celebrity posts, and later, when serving each follower's timeline, fetch the celebrity's posts and merge them into the timeline. When millions of followers do that, it will be a cheap read-only fetch from a hot cache.
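
To make the idea concrete, here is a rough in-memory sketch of what I mean. Everything in it is hypothetical (the `HybridTimelines` class, the follower-count `threshold`, the tuple layout), and it assumes posts arrive in timestamp order. Posts from small accounts are pushed into follower inboxes at write time; posts from accounts over the threshold are left in place and merged in at read time.

```python
import heapq
from collections import defaultdict
from itertools import islice


class HybridTimelines:
    """Toy model: push for small accounts, pull for popular ones."""

    def __init__(self, threshold=2):
        # Hypothetical cutoff: accounts with at least this many followers
        # are treated as "celebrities" and are not fanned out.
        self.threshold = threshold
        self.followers = defaultdict(set)   # author -> users following them
        self.following = defaultdict(set)   # user -> authors they follow
        self.inbox = defaultdict(list)      # user -> entries pushed at write time
        self.pull_posts = defaultdict(list) # author -> entries fetched at read time

    def follow(self, user, author):
        self.followers[author].add(user)
        self.following[user].add(author)

    def post(self, author, ts, text):
        entry = (ts, author, text)
        if len(self.followers[author]) >= self.threshold:
            # Celebrity: do nothing per follower, just keep the post once.
            self.pull_posts[author].append(entry)
        else:
            # Regular account: fan out to every follower's inbox.
            for user in self.followers[author]:
                self.inbox[user].append(entry)

    def timeline(self, user, limit=10):
        # Each source is in ascending timestamp order, so reversing it
        # yields newest-first input for a descending merge.
        sources = [reversed(self.inbox[user])]
        for author in self.following[user]:
            sources.append(reversed(self.pull_posts[author]))
        merged = heapq.merge(*sources, key=lambda e: e[0], reverse=True)
        return list(islice(merged, limit))
```

In a real system `pull_posts` would be the hot cache of recent posts per popular account, and the threshold would need tuning. Because each post is stored on exactly one path, an account that crosses the threshold later doesn't produce duplicates in this sketch.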

replies(5): >>43108664 #>>43108812 #>>43109007 #>>43110207 #>>43113811 #
locusofself ◴[] No.43108812[source]
Why do they "insert" even non-celebrity posts into each follower's timeline? That is not intuitive to me.
replies(2): >>43109032 #>>43111095 #
1. giovannibonetti ◴[] No.43109032[source]
To serve a user timeline in single-digit milliseconds, it is not practical for a data store to load each item from a different place. Even with an index, the index itself can be contiguous on disk, but the payload is scattered all over the place if you keep it in a single large table.

Instead, you can drastically speed up reads if you are able to store the data for each timeline somewhat contiguously on disk.
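
As a rough illustration of that layout (not any particular database's API), the sketch below keeps rows in one sorted list keyed by `(user_id, -ts)`, which stands in for a table clustered on that key. The `ClusteredTimelineStore` name and its methods are made up for this example. Reading a timeline is then one seek followed by a sequential scan, rather than one random lookup per item.

```python
import bisect


class ClusteredTimelineStore:
    """In-memory stand-in for a table clustered on (user_id, -ts)."""

    def __init__(self):
        # Rows are kept sorted, so all rows for one user sit next to
        # each other, newest first (timestamps are stored negated).
        self.rows = []

    def insert(self, user_id, ts, payload):
        bisect.insort(self.rows, (user_id, -ts, payload))

    def read(self, user_id, limit=10):
        # One "seek": the 1-tuple (user_id,) sorts before every row
        # that starts with user_id, so this finds that user's first row.
        start = bisect.bisect_left(self.rows, (user_id,))
        out = []
        # One sequential scan over adjacent rows.
        for uid, neg_ts, payload in self.rows[start:start + limit]:
            if uid != user_id:
                break
            out.append((-neg_ts, payload))
        return out
```

The trade-off is on the write side: every post has to be inserted into each follower's range, which is the fan-out cost discussed upthread.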