
Are we decentralized yet?

(arewedecentralizedyet.online)
487 points Bogdanp | 3 comments
d4mi3n ◴[] No.45077410[source]
Neat! I'm not surprised at the findings here. Bluesky (for the average user) is pretty much a drop-in replacement for Twitter.

Despite the smaller total numbers in Mastodon, it's great to see that the ecosystem seems to be successfully avoiding centralization like we've seen in the AT-Proto ecosystem.

I suspect that the cost of running AT proto servers/relays is prohibitive for smaller players compared to a Mastodon server selectively syndicating with a few peers, but I say this with only a vague understanding of the internals of both of these ecosystems.

replies(6): >>45077507 #>>45077986 #>>45078151 #>>45078889 #>>45079652 #>>45080382 #
danabramov ◴[] No.45077986[source]
>I suspect that the cost of running AT proto servers/relays is prohibitive for smaller players compared to a Mastodon server selectively syndicating with a few peers, but I say this with only a vague understanding of the internals of both of these ecosystems.

This isn't quite right. ATProto has a completely different "shape" so it's hard to make apples-to-apples comparison.

Roughly speaking, you can think of Mastodon as a bunch of little independently hosted copies of Twitter that "email" (loosely speaking) each other to propagate information that isn't on your server. So it's cheap to run a server for a bunch of friends but it's cut off from what's happening in the world. Your identity is tied to your server (that's your webapp), and when you want to follow someone on another server, your server essentially asks that other server to send stuff to yours. This means that by default your view of the network is extremely fragmented — replies, threads, like counts are all desynchronized and partial[1] depending on which server you're looking from and which information is being forwarded to it.

ATProto, on the other hand, is designed with a goal of actually being competitive with centralized services. This means that it's partitioned differently – it's not "many Twitters talking to each other" which is Mastodon's model. Instead, in ATProto, there is a separation of concerns: you have swappable hosting (your hosting is the source of truth for your data like posts, likes, follows, etc) and you have applications (which aggregate data from the former). This might remind you of traditional web: it's like every social media user posts JSON to "their own website" (i.e. hosting) while apps aggregate all that data, similar to how Google Reader might aggregate RSS. As a result, in ATProto, the default behavior is that everyone operates with a shared view of the world — you always see all replies, all comments, all likes are counted, etc. It's not partial by default.
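
A toy sketch of the difference described above, not real Mastodon or ATProto code. The server names, the `LIKES` data, and both function names are hypothetical; the only point is that a per-server view counts what was forwarded to it, while an aggregating app counts everything.

```python
# Toy data: (liker, liker_home_server, post_id). All values are made up.
LIKES = [
    ("alice", "a.example", "post1"),
    ("bob", "b.example", "post1"),
    ("carol", "c.example", "post1"),
]


def federated_like_count(viewing_server, forwarded_from, post_id):
    """Likes visible from one server: its own users' likes plus likes
    forwarded from the servers listed in `forwarded_from`."""
    visible = {viewing_server} | set(forwarded_from)
    return sum(1 for _, home, pid in LIKES if pid == post_id and home in visible)


def aggregated_like_count(post_id):
    """Likes as seen by an app that aggregates records from every host."""
    return sum(1 for _, _, pid in LIKES if pid == post_id)


if __name__ == "__main__":
    print(federated_like_count("a.example", ["b.example"], "post1"))
    print(federated_like_count("c.example", [], "post1"))
    print(aggregated_like_count("post1"))
```

In this toy model the two federated calls return different numbers for the same post, while the aggregated count is the same for every viewer.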

With this difference in mind, "decentralizing" ATProto is sort of multidimensional. In Mastodon, the only primitive is an "instance" — i.e. an entire Twitter-like webapp you can host for your users. But in ATProto, there are multiple decentralized primitives:

- PDS (personal data hosting) is an application-agnostic data store. Bluesky's implementation is open source (it uses a SQLite database per user). There are also alternative implementations of the same protocol. Bluesky the company does operate the largest ones. However, running a PDS for yourself is extremely cheap (like maybe $1/mo?). It's basically just structured KV JSON storage organized as a Merkle tree. A bit like Git hosting.

- AppViews are actual "application backends". Bluesky operates the bsky.app appview, i.e. what people know as the Bluesky app. Importantly, in ATProto, there is no reason for everyone to run their own AppView. You can run one (and it costs about $300/mo to run a Bluesky AppView ingesting all data currently on the network in real time if you want to do that). Of course, if you were happy with the tradeoffs chosen by Mastodon (a partial view of the network, where you only see what your server's users follow), you could run that for a lot cheaper, so that's why I'm saying it's not apples-to-apples. ATProto makes it easy to have an actually cohesive experience on the network, but its costs are usually compared with the fragmented experience of Mastodon. ATProto can scale down to Mastodon-like UX (with Mastodon-like costs) but it's just not very appealing when you can have the real thing.

- Relays are things "in between" PDS's and AppViews. Essentially a Relay is just an optimization to avoid many-to-many connections between AppViews and PDS's. A Relay just rebroadcasts updates from all PDS's as a single stream (that AppViews can subscribe to). Running a Relay used to be expensive but it got a lot cheaper since "Sync 1.1" (when a change in protocol allowed Relays to be non-archiving). Now it costs about $30/mo to run a Relay.
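
To make the "avoid many-to-many connections" point concrete, here is a minimal sketch under stated assumptions. The `Relay` class, its methods, and the event shape are hypothetical and are not the real ATProto relay API; the sketch only shows the fan-in/fan-out idea and the connection-count arithmetic.

```python
def direct_connections(num_pds, num_appviews):
    """Every AppView connects to every PDS: many-to-many."""
    return num_pds * num_appviews


def relayed_connections(num_pds, num_appviews):
    """Every PDS and every AppView holds one connection to a single relay."""
    return num_pds + num_appviews


class Relay:
    """Hypothetical non-archiving relay: rebroadcasts updates, stores nothing."""

    def __init__(self):
        self.subscribers = []

    def subscribe(self, callback):
        # An AppView registers a callback to receive the merged stream.
        self.subscribers.append(callback)

    def publish(self, pds_host, update):
        # An update from any PDS is passed straight on to all subscribers.
        for callback in self.subscribers:
            callback({"pds": pds_host, "update": update})


if __name__ == "__main__":
    print(direct_connections(1000, 5), relayed_connections(1000, 5))
    relay = Relay()
    relay.subscribe(print)
    relay.publish("pds1.example", {"type": "post"})
```

With 1000 PDSs and 5 AppViews, the direct layout needs 5000 connections and the relayed layout needs 1005.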

So all in all, running PDSs and Relays is cheap. Running full AppViews is more expensive but there's simply no equivalent to that in the Mastodon world because Mastodon is always fragmented[1]. And running a partial AppView (comparable to Mastodon behavior) should be much, much cheaper — but also not very appealing so I don't know anyone who's actually doing that. (It would also require adding a bit of code to filter out the stuff you don't care about.)
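
The "bit of code to filter out the stuff you don't care about" could look roughly like the sketch below. This is an illustration only: the event dictionaries, the `author` key, and the `did:example:...` identifiers are assumptions, not the real firehose format.

```python
def partial_view(events, followed_authors):
    """Yield only the events written by authors this small AppView cares about."""
    for event in events:
        if event["author"] in followed_authors:
            yield event


if __name__ == "__main__":
    # Made-up stream of events, as a relay might deliver them.
    stream = [
        {"author": "did:example:alice", "text": "hello"},
        {"author": "did:example:bob", "text": "unrelated"},
        {"author": "did:example:alice", "text": "again"},
    ]
    for event in partial_view(stream, {"did:example:alice"}):
        print(event["text"])
```

Everything dropped by the filter never needs to be indexed or stored, which is where the cost savings of a partial AppView would come from.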

[1] Mastodon is adding a workaround for this with on-demand fetching, see https://news.ycombinator.com/item?id=45078133 for my questions about that; in any case, this is limited by what you can do on-demand in a pull-based decentralized system.

replies(4): >>45078344 #>>45081740 #>>45081898 #>>45087265 #
1. Vinnl ◴[] No.45081740[source]
> [1] Mastodon is adding a workaround for this with on-demand fetching, see https://news.ycombinator.com/item?id=45078133 for my questions about that; in any case, this is limited by what you can do on-demand in a pull-based decentralized system.

I'm not super up-to-date on Mastodon's/ActivityPub's workings, but aren't replies to a post pushed to the original poster's server? So wouldn't followers then be able to pull from that server at any time to get an always-up-to-date view of replies, at least theoretically? (With maybe posts from the last few seconds missing if the network's slow.)

(Asking because I've seen you claim that the architecture is inherently limited to never be able to achieve the "cohesive" experience.)

replies(1): >>45081948 #
2. danabramov ◴[] No.45081948[source]
This only works for direct reply chains, right? It doesn’t provide a realtime view into all existing conversations that are happening on the post.

Imagine if, when you refreshed this HN page, only comment chains you’re already in would refresh promptly. Yes, this would “work” to some extent, but it would clearly be a regression.

Additionally, going viral can overload your server due to this architecture. In ATProto this never happens for self-hosters (of PDS) because the cost is amortized by the AppView. (Same as in centralized products where the cost is on the backend.)

replies(1): >>45083255 #
3. Vinnl ◴[] No.45083255[source]
Ah gotcha, yes, indirect replies would need to be forwarded or something.

(To be honest, I'm already surprised that Mastodon scaled as far as it did. I will say, if I had seen the state of the web's architecture 20 years ago today, I probably also would have claimed that it was inherently insecure and that there was no way to get it to be secure enough to scale to billions of users, so... I don't know, maybe people will keep finding duct tape solutions to make it work, worse-is-better-style.)