When imperfect systems are good: Bluesky's lossy timelines

(jazco.dev)

785 points cyndunlop | 1 comments | 19 Feb 25 17:48 UTC

pornel ◴[19 Feb 25 22:25 UTC] No.43108545[source]▶

I wonder why timelines aren't implemented as a hybrid gather-scatter that chooses its strategy by account popularity: fan-out to followers for ordinary accounts, combined with a lazy fetch of popular followed accounts when a follower's timeline is served.

When you have a celebrity account, instead of fanning out every message to millions of followers' timelines, it would be cheaper to do nothing when the celebrity posts, and later, when serving each follower's timeline, fetch the celebrity's posts and merge them into the timeline. When millions of followers do that, it will be a cheap read-only fetch from a hot cache.
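
A minimal in-memory sketch of that hybrid idea, for illustration only. Every name here (`HybridTimeline`, `threshold`, `inbox`, `lazy_posts`) is hypothetical and not taken from Bluesky or any other real system, and the follower-count cutoff is an assumed stand-in for "account popularity". Ordinary accounts fan out on write; accounts at or above the cutoff store the post once, and it is merged in when a follower's timeline is read.

```python
import heapq
from collections import defaultdict
from itertools import islice


class HybridTimeline:
    """Toy hybrid of fan-out-on-write and fetch-on-read (all names hypothetical)."""

    def __init__(self, threshold=2):
        self.threshold = threshold            # assumed popularity cutoff (follower count)
        self.followers = defaultdict(set)     # author -> users following them
        self.following = defaultdict(set)     # user -> authors they follow
        self.inbox = defaultdict(list)        # user -> fanned-out posts, oldest first
        self.lazy_posts = defaultdict(list)   # author -> posts NOT fanned out, oldest first
        self.clock = 0                        # logical timestamp, unique per post

    def follow(self, user, author):
        self.followers[author].add(user)
        self.following[user].add(author)

    def post(self, author, text):
        self.clock += 1
        entry = (self.clock, author, text)
        if len(self.followers[author]) >= self.threshold:
            # Popular account: store once, do no per-follower work at write time.
            self.lazy_posts[author].append(entry)
        else:
            # Ordinary account: copy the post into every follower's timeline.
            for follower in self.followers[author]:
                self.inbox[follower].append(entry)
        return entry

    def timeline(self, user, limit=10):
        # Each source is iterated newest-first so heapq.merge can combine them.
        sources = [reversed(self.inbox.get(user, []))]
        for author in self.following.get(user, ()):
            posts = self.lazy_posts.get(author)
            if posts:
                sources.append(reversed(posts))
        merged = heapq.merge(*sources, reverse=True)
        return list(islice(merged, limit))


tl = HybridTimeline(threshold=2)
tl.follow("alice", "celeb")
tl.follow("bob", "celeb")
tl.follow("alice", "bob")
tl.post("bob", "hi")        # bob has 1 follower: fanned out to alice
tl.post("celeb", "hello")   # celeb has 2 followers: stored lazily
tl.post("bob", "again")
alice_view = tl.timeline("alice")
bob_view = tl.timeline("bob")
```

Because each post records at write time whether it was fanned out, a post never appears twice even if an account later crosses the cutoff. The sketch does not backfill old fanned-out posts for new followers, and in a real deployment `lazy_posts` would sit behind the hot cache mentioned above.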

replies(5): >>43108664 #>>43108812 #>>43109007 #>>43110207 #>>43113811 #

rubslopes ◴[19 Feb 25 23:15 UTC] No.43109007[source]▶

>>43108545 #

This problem is discussed in the beginning of the Designing Data-Intensive Applications book. It's worth a read!

replies(1): >>43111608 #

Brystephor ◴[20 Feb 25 06:01 UTC] No.43111608[source]▶

>>43109007 #

Do you know the name of the problem or strategy used for solving the problem? I'd be interested in looking it up!

I own DDIA, but after a few chapters on how databases work behind the scenes, I begin to fall asleep. I have trouble understanding how to apply the knowledge to my work, but this seems like a useful thing with a clearer application.

replies(1): >>43113886 #

1. bitbckt ◴[20 Feb 25 12:26 UTC] No.43113886[source]▶

>>43111608 #

Yes, we used the Yahoo! “Feeding Frenzy” paper as the basis for the design of Haplocheirus (the timeline service).
