When imperfect systems are good: Bluesky's lossy timelines

(jazco.dev)

785 points cyndunlop | 1 comments | 19 Feb 25 17:48 UTC

pornel ◴[19 Feb 25 22:25 UTC] No.43108545[source]▶

I wonder why timelines aren't implemented as a hybrid scatter-gather, choosing the strategy by account popularity: fan-out to followers for ordinary accounts, plus a lazy fetch of popular followed accounts when a follower's timeline is served.

When you have a celebrity account, instead of fanning out every message to millions of followers' timelines, it would be cheaper to do nothing when the celebrity posts, and later, when serving each follower's timeline, fetch the celebrity's posts and merge them into the timeline. When millions of followers do that, it will be a cheap read-only fetch from a hot cache.
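A minimal in-memory sketch of that hybrid, not Bluesky's actual design. Every name here (`CELEBRITY_THRESHOLD`, `follow`, `post`, `read_timeline`) is hypothetical, the threshold is tiny so the example is easy to follow, and posts are assumed to arrive in timestamp order.

```python
import heapq
from collections import defaultdict

CELEBRITY_THRESHOLD = 3  # hypothetical cutoff; a real one would be far larger

followers = defaultdict(set)      # author -> readers following them
following = defaultdict(set)      # reader -> authors they follow
timelines = defaultdict(list)     # reader -> entries pushed at write time
author_posts = defaultdict(list)  # author -> their own posts, oldest first


def follow(reader, author):
    followers[author].add(reader)
    following[reader].add(author)


def is_celebrity(author):
    return len(followers[author]) >= CELEBRITY_THRESHOLD


def post(author, ts, text):
    entry = (ts, author, text)
    author_posts[author].append(entry)
    # Ordinary accounts fan out on write; celebrities do nothing here.
    if not is_celebrity(author):
        for reader in followers[author]:
            timelines[reader].append(entry)


def read_timeline(reader, limit=10):
    # Start from the precomputed timeline, then lazily pull in the posts
    # of every followed celebrity and merge newest-first.
    sources = [reversed(timelines[reader])]
    sources += [reversed(author_posts[a])
                for a in following[reader] if is_celebrity(a)]
    out, seen = [], set()
    for entry in heapq.merge(*sources, reverse=True):
        # An account that crossed the threshold may have older posts in
        # both places, so drop duplicates.
        if entry not in seen:
            seen.add(entry)
            out.append(entry)
            if len(out) == limit:
                break
    return out
```

In a real system the `author_posts` lookup for a celebrity would be the "hot cache" read, shared by all of that account's followers.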

replies(5): >>43108664 #>>43108812 #>>43109007 #>>43110207 #>>43113811 #

rsynnott ◴[20 Feb 25 12:15 UTC] No.43113811[source]▶

>>43108545 #

> and later when serving each follower's timeline, fetch the celebrity's posts and merge them into the timeline

I think then you still have the 'weird user who follows hundreds of thousands of people' problem, just at read time instead of write time. It's unclear that this is _better_, though, yeah, caching might help. But if you follow every celeb on Bluesky (and I guarantee you this user exists) you'd be looking at fetching and merging _thousands_ of timelines (again, I suppose you could just throw up your hands and say "not doing that", and just skip most or all of the celebs for problem users).

Given the nature of the service, making reads predictably cheap and writes potentially expensive (which seems to be the way they've gone) seems like a defensible practice.

replies(2): >>43113890 #>>43118143 #

1. fc417fc802 ◴[20 Feb 25 18:15 UTC] No.43118143[source]▶

>>43113811 #

> I suppose you could just throw up your hands and say "not doing that", and just skip most or all of the celebs for problem users

Random sampling? It's not as though the user needs thousands of posts returned for a single fetch. Scrolling down and seeing some stuff that's not in chronological order seems like an acceptable tradeoff.
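One way to read the sampling idea, as a rough sketch: cap the number of celebrity feeds merged per fetch and pick that subset at random. The function name `sampled_merge` and its parameters are made up for illustration, and each feed is assumed to be already sorted newest-first.

```python
import heapq
import random
from itertools import islice


def sampled_merge(celebrity_feeds, max_sources, limit, rng=random):
    """Merge posts from at most `max_sources` randomly chosen feeds.

    celebrity_feeds: dict mapping author -> list of (ts, author, text)
    tuples, newest first. `rng` is the random module or a random.Random
    instance, so tests can pass a seeded generator.
    """
    authors = list(celebrity_feeds)
    if len(authors) > max_sources:
        authors = rng.sample(authors, max_sources)
    merged = heapq.merge(*(celebrity_feeds[a] for a in authors), reverse=True)
    return list(islice(merged, limit))
```

Each call is ordered within itself, but successive fetches draw different subsets, so a reader scrolling down could see posts out of chronological order across pages, which is the tradeoff described above.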
