
768 points cyndunlop | 8 comments
1. dsauerbrun ◴[] No.43111919[source]
I'm a bit confused.

The lossy timeline solution basically means you skip updating the feed for some people who follow more than a reasonable number of accounts. I get that.

Seeing them get 96% improvements is insane. Does that mean they have a ton of users following an unreasonable number of people, or do they just set a very low limit for reasonable follows? I doubt it's the latter, since that would mean a lot of people would be missing updates.

How is it possible to get such massive improvements when you're only skipping a presumably small % of people per new post?

EDIT: nvm, I thought about it again. The issue is that the timeline of a single user who follows millions of accounts is constantly being written to, which slows down the fanout service when a celebrity makes a post, since you're going through many db pages.
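
For anyone else trying to picture it, here is a minimal sketch of a lossy fanout. It is not Bluesky's code: `REASONABLE_LIMIT`, `loss_factor`, `fanout`, and the in-memory `timelines` dict are all hypothetical names, and the probabilistic drop rule (keep a write with probability limit / follow count) is my assumption about how "skipping some people" could work.

```python
import random

# Hypothetical threshold: follow counts at or below this never lose writes.
REASONABLE_LIMIT = 2_000


def loss_factor(follow_count, limit=REASONABLE_LIMIT):
    """Probability of keeping a write for a user who follows `follow_count` accounts."""
    if follow_count <= 0:
        return 1.0
    return min(limit / follow_count, 1.0)


def fanout(post_id, followers, follow_counts, timelines, rng=random.random):
    """Append post_id to each follower's timeline, dropping some writes
    for followers whose own follow count exceeds the limit.

    Returns the number of timeline writes actually performed.
    """
    written = 0
    for user in followers:
        # rng() is in [0, 1), so a loss factor of 1.0 always writes.
        if rng() < loss_factor(follow_counts[user]):
            timelines.setdefault(user, []).append(post_id)
            written += 1
    return written
```

Under this rule, ordinary users always get the post, while an account following twice the limit would receive roughly half of all posts, so the write load on that one hot timeline stays bounded.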

replies(4): >>43112063 #>>43112582 #>>43113226 #>>43116000 #
2. friendzis ◴[] No.43112063[source]
When a system gets "overloaded", it typically enters a state of exponential performance degradation, i.e. it effectively DDoSes itself.

> Seeing them get 96% improvements is insane

TFA is talking about P99 tail latencies. It does not sound too insane to reduce tail latencies by extraordinary margins. Remember, it's just a reshaping of the latency distribution. In this case the pathological cases get dropped.
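
A toy illustration of that reshaping, with made-up numbers (these are not Bluesky's measurements, and `percentile` is just a nearest-rank helper written for this example): if only the slowest few jobs change, the median stays put while P99 collapses.

```python
import math


def percentile(values, p):
    """Nearest-rank percentile: the smallest value with at least p% of data at or below it."""
    ordered = sorted(values)
    rank = max(1, math.ceil(p * len(ordered) / 100))
    return ordered[rank - 1]


# Synthetic latencies in ms: 97 ordinary jobs plus 3 pathological ones.
before = [10] * 97 + [1000] * 3
# Same workload after the pathological writes are mostly dropped.
after = [10] * 97 + [40] * 3

p99_before = percentile(before, 99)
p99_after = percentile(after, 99)
improvement = (p99_before - p99_after) / p99_before

print(f"P50 before/after: {percentile(before, 50)} / {percentile(after, 50)} ms")
print(f"P99 before/after: {p99_before} / {p99_after} ms")
print(f"P99 improvement: {improvement:.0%}")
```

Only 3% of the jobs changed in this example, yet the P99 figure moves by 96% because the percentile sits entirely inside the slow tail.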

3. aloha2436 ◴[] No.43112582[source]
> does that mean they have a ton of users following an unreasonable number of people

They do, there are groups of users on bluesky who follow inordinate numbers of other accounts to try and get follows back.

4. Beretta_Vexee ◴[] No.43113226[source]
> does that mean they have a ton of users following an unreasonable number of people

Look at the accounts of OnlyFans models, crypto influencers, etc. They follow thousands or even tens of thousands of accounts in the hope that we will follow them in return.

replies(1): >>43113919 #
5. mapt ◴[] No.43113919[source]
I don't see that accommodating this behavior is prosocial or technically desirable.

Can you think of a use case?

All sorts of bots want this sort of access, but whether there are legitimate reasons to grant it to them on a non-sharded basis is another question, since the resource cost of many of these queries does not scale as O(n), even on a centralized server architecture.

replies(2): >>43116716 #>>43117816 #
6. citrus1330 ◴[] No.43116000[source]
They were specifically looking at worst-case performance. P99 means 99th percentile, so they saw 96% improvement on the longest 1% of jobs.
7. marksomnian ◴[] No.43116716{3}[source]
From TFA:

> Generally, this can be dealt with via policy and moderation to prevent abusive users from causing outsized load on systems, but these processes take time and can be imperfect.

So it’s a case of the engineers accepting that, however hard they try to moderate, these sorts of cases will crop up and they may as well design their infrastructure to handle them.

8. tart-lemonade ◴[] No.43117816{3}[source]
Given enough time, you'll end up with a lot of legitimate users who follow a huge number of accounts but rarely interact with more than a handful, similar to how many long-time YouTubers have a very high subscriber:viewer ratio (that is, they have way more subscribers than you would expect given their average view count), and there's nothing inherently suspicious about it. People lose access to their accounts, make new accounts, die, get bored, or otherwise stop watching the content but never bother unsubscribing because the algorithm recognized this and stopped recommending the channel's uploads to them.

Bluesky doesn't have this problem yet because it's so young, so the outsized follow counts are mostly going to be from doomscrollers and outright malicious users. But even if it were exclusively malicious users, there is no perfect algorithm to identify them, much less to do so before they start causing performance problems. Under those constraints, it makes sense to limit the potential blast radius and keep the site more usable for everyone.